Semantic Enrichment of Text with Background Knowledge
Anselmo Peñas
NLP & IR Group
UNED
nlp.uned.es
Eduard Hovy
USC / ISI
isi.edu
Text omits information
San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.
Making implicit information explicit
Implicit -> (More) explicit
San Francisco's Eric Davis -> Eric Davis plays for San Francisco; E.D. is a player, S.F. is a team
Eric Davis intercepted pass1 -> -
Steve Walsh pass1 -> Steve Walsh threw pass1; Steve Walsh threw interception1 …
Young touchdown pass2 -> Young completed pass2 for touchdown …
touchdown pass2 to Brent Jones -> Brent Jones caught pass2 for touchdown
San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.
Goals
General goal: automatic recovery of such omitted information
Enrichment is the process of adding explicitly to a text’s representation the information that is either implicit or missing in the text
The enrichment cycle
Cycle:
1. Read text from collection
2. Ruminate in BKB
3. Enrich text representation
4. Repeat
[Figure: cycle diagram with the labels Domain Docs., Reading, Background Knowledge Base, Rumination, Enrichment]
Goals
Specific goals of this work
Explore the idea of using “Proposition Stores” as Background Knowledge for enrichment
Explore procedures for enrichment
Determine the kinds of knowledge that Proposition Stores must include to enable enrichment
Outline
1. Intro
2. BKB
3. Enrichment
4. Features of BKBs for Enrichment
5. Conclusion
Elements in our BKB
Entities
• Classes: not limited to a predefined set
• Instances: proper nouns (in this first approach)
• Class:has-instance:Instance relations
Propositions: predefined syntactic structures
• NV, NVPN
• NVN, NVNPN
• NPN, AN
• …
Extraction of propositions
Patterns over dependency trees:
prop( Type, Form : DependencyConstraints : NodeConstraints ).
Examples:
prop(nv, [N,V] : [V:N:nsubj, not(V:_:'dobj')] : [verb(V)]).
prop(nvnpn, [N1,V,N2,P,N3]:[V:N2:'dobj', V:N3:Prep, subj(V,N1)]:[prep(Prep,P)]).
prop(has_value, [N,Val]:[N:Val:_]:[nn(N), cd(Val), not(lemma(Val,'one'))]).
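The nv pattern above can be read as "a verb with a subject and no direct object". The following is a minimal Python sketch of that reading, not the authors' Prolog implementation. It assumes dependency edges arrive as (head, dependent, label) triples with lemmas as node names, and that part-of-speech tags are available in a dictionary. The function name `extract_props`, the toy parse, and the nvn rule (subject plus direct object) are illustrative assumptions.

```python
from collections import Counter

def extract_props(edges, pos):
    """Emit NV and NVN propositions from (head, dependent, label) edges."""
    props = []
    for head, dep, label in edges:
        # Node constraint verb(V): the head of an nsubj edge must be a verb.
        if label != "nsubj" or not pos.get(head, "").startswith("V"):
            continue
        # Direct objects of the same verb, if any.
        objects = [d for h, d, l in edges if h == head and l == "dobj"]
        if objects:
            # Assumed nvn rule: subject, verb, direct object.
            props.extend(("nvn", (dep, head, obj)) for obj in objects)
        else:
            # Mirrors not(V:_:'dobj') in the nv pattern.
            props.append(("nv", (dep, head)))
    return props

# Hypothetical hand-written parses of two short sentences.
edges = [
    ("throw", "quarterback", "nsubj"),
    ("throw", "pass", "dobj"),
    ("run", "receiver", "nsubj"),
]
pos = {"throw": "VB", "run": "VB",
       "quarterback": "NN", "pass": "NN", "receiver": "NN"}

# Counting repeated propositions over a collection yields the store.
store = Counter(extract_props(edges, pos))
```

Accumulating the counter over every parsed sentence in a collection would give frequency-weighted propositions of the kind queried on the following slides.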
Background Knowledge Base (NFL, US football)
?> NN NNP:'pass'
NN 24 'Marino':'pass'
NN 17 'Kelly':'pass'
NN 15 'Elway':'pass'
…
?> X:has-instance:'Marino'
20 'quarterback':has-instance:'Marino'
6 'passer':has-instance:'Marino'
4 'leader':has-instance:'Marino'
3 'veteran':has-instance:'Marino'
2 'player':has-instance:'Marino'
?> NPN 'pass':X:'touchdown'
NPN 712 'pass':'for':'touchdown'
NPN 24 'pass':'include':'touchdown'
…
?> NVN 'quarterback':X:'pass'
NVN 98 'quarterback':'throw':'pass'
NVN 27 'quarterback':'complete':'pass'
…
?> NVNPN 'NNP':X:'pass':Y:'touchdown'
NVNPN 189 'NNP':'catch':'pass':'for':'touchdown'
NVNPN 26 'NNP':'complete':'pass':'for':'touchdown'
…
?> NVN 'end':X:'pass'
NVN 28 'end':'catch':'pass'
NVN 6 'end':'drop':'pass'
…
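Queries like these can be pictured as wildcard lookups over frequency-counted tuples. The sketch below is one possible Python rendering, assuming the store is a `Counter` keyed by (type, arguments). The counts are copied from the examples above, while the storage layout and the `query` function are illustrative assumptions, not the actual BKB interface.

```python
from collections import Counter

# Counts copied from the NFL examples; the layout is an assumption.
store = Counter({
    ("NVN", ("quarterback", "throw", "pass")): 98,
    ("NVN", ("quarterback", "complete", "pass")): 27,
    ("NVN", ("end", "catch", "pass")): 28,
    ("NVN", ("end", "drop", "pass")): 6,
    ("NPN", ("pass", "for", "touchdown")): 712,
    ("NPN", ("pass", "include", "touchdown")): 24,
})

def query(store, ptype, pattern):
    """Return (count, arguments) pairs matching pattern, most frequent first.

    None in the pattern plays the role of the variable X in the queries.
    """
    hits = [
        (count, args)
        for (t, args), count in store.items()
        if t == ptype
        and len(args) == len(pattern)
        and all(p is None or p == a for p, a in zip(pattern, args))
    ]
    return sorted(hits, reverse=True)

# Equivalent of: ?> NVN 'quarterback':X:'pass'
results = query(store, "NVN", ("quarterback", None, "pass"))
```

Returning the counts alongside the matches keeps the evidence visible, so later steps can prefer frequent fillers over rare ones.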
Enrichment example (1)
…to set up a 7-yard Young touchdown pass to Brent Jones
[Figure: dependency graph headed by "pass" with dependents Young (nn), touchdown (nn), Jones (to)]
Young pass
?> X:has-instance:Young
X=quarterback
?> NVN:quarterback:X:pass
X=throw
X=complete
pass to Jones
?> X:has-instance:Jones
X=end
?> NVN:end:X:pass
X=catch
X=drop
Enrichment example (2)
[Figure: dependency graph headed by "pass" with dependents Young (throw, complete), touchdown (nn), Jones (catch, drop)]
touchdown pass
?> NVN touchdown:X:pass
False
?> NPN pass:X:touchdown
X=for
…to set up a 7-yard Young touchdown pass to Brent Jones
Enrichment example (3)
[Figure: dependency graph headed by "pass" with dependents Young (throw, complete), touchdown (for), Jones (catch, drop)]
?> NVNPN NAME:X:pass:for:touchdown
X=complete
X=catch
…to set up a 7-yard Young touchdown pass to Brent Jones
Enrichment example (4)
[Figure: dependency graph headed by "pass" with dependents Young (complete), touchdown (for), Jones (catch)]
Young complete pass for touchdown
Jones catch pass for touchdown
…to set up a 7-yard Young touchdown pass to Brent Jones
Enrichment
Build context for instances
Build context for dependencies
• Finding prepositions
• Finding verbs
Constrain interpretations
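One way to read the "constrain interpretations" step in the running example is as a set intersection: a verb survives only if it is supported both by the class-level query and by the longer query that includes the touchdown. The Python sketch below illustrates that reading with the candidates from examples (1) to (3). The variable names and the `constrain` function are hypothetical, and the actual procedure may weigh evidence differently.

```python
# Candidate verbs copied from enrichment examples (1) to (3).
verbs_for_class = {
    "quarterback": {"throw", "complete"},  # ?> NVN:quarterback:X:pass
    "end": {"catch", "drop"},              # ?> NVN:end:X:pass
}

# ?> NVNPN NAME:X:pass:for:touchdown
verbs_with_touchdown = {"complete", "catch"}

# Most frequent class per instance, from the has-instance queries.
instance_class = {"Young": "quarterback", "Jones": "end"}

def constrain(instance):
    """Keep only the verbs supported by both queries (assumed rule)."""
    cls = instance_class[instance]
    return verbs_for_class[cls] & verbs_with_touchdown

enriched = {name: constrain(name) for name in instance_class}
# enriched == {"Young": {"complete"}, "Jones": {"catch"}}
```

The result matches the reading in example (4): Young completes the pass for a touchdown and Jones catches it.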
Enrichment example (5)
San Francisco's Eric Davis intercepted a Steve Walsh pass on the next series to set up a seven-yard Young touchdown pass to Brent Jones.
[Figure: dependency graphs of the sentence before and after enrichment; surviving edge labels: for, throw, catch, complete]
What do BKBs need for enrichment? (1)
Ability to answer about instances
• Not complete population
• But allow analogy
Ability to constrain interpretations and accumulate evidence
• Several different queries over the same elements considering different syntactic structures
• Require normalization (and parsing)
What do BKBs need for enrichment? (1)
Ability to discover entity classes with appropriate granularity level
• Quarterbacks throw passes
• Ends catch passes
• Tagging an entity as person or even player is not specific enough for enrichment
Text frequently introduces the relevant class (appropriate granularity level) for understanding
What do BKBs need for enrichment? (2)
Ability to digest enough knowledge adapted to the domain
• Crucial
Approaches
• Macro-reading (web scale) + domain adaptation
  • Shallow NLP, lack of normalization
• Reading in context (suggested here)
  • Domain partitioning
  • Deeper NLP, specific domain NLP
Digest enough knowledge
DART: general domain propositions store
TextRunner: general domain (web-scale)
BKB: specific domain propositions store (only 30,000 docs)
?> quarterback:X:pass
DART: (no results)
TextRunner: (~200) threw, (~100) completed, (36) to throw, (26) has thrown, (19) makes, (19) has, (18) fires
BKB (US Football): (99) throw, (25) complete, (7) have, (5) attempt, (5) not-throw, (4) toss, (3) release
Digest knowledge in the domain (entity classes)
?> X:intercept:pass
DART: (13) person, (6) person/place/organization, (2) full-back, (1) place
TextRunner: (30) Early, (26) Two plays, (24) fumble, (20) game, (20) ball, (17) Defensively
BKB (US Football): (75) person, (14) cornerback, (11) defense, (8) safety, (7) group, (5) linebacker
Digest knowledge in the domain (ambiguity problem)
?> person:X:pass
DART: (47) make, (45) take, (36) complete, (30) throw, (25) let, (23) catch, (1) make, (1) expect
TextRunner: (22) gets, (17) makes, (10) has, (10) receives, (7) who has, (7) must have, (6) acting on, (6) to catch, (6) who buys, (5) bought, (5) admits, (5) gives
BKB (US Football): (824) catch, (546) throw, (256) complete, (136) have, (59) intercept, (56) drop, (39) not-catch, (37) not-throw, (36) snare, (27) toss, (23) pick off, (20) run
Domain issue
?> person:X:pass
NFL Domain
905:nvn:[person:n, catch:v, pass:n].
667:nvn:[person:n, throw:v, pass:n].
286:nvn:[person:n, complete:v, pass:n].
204:nvnpn:[person:n, catch:v, pass:n, for:in, yard:n].
85:nvnpn:[person:n, catch:v, pass:n, for:in, touchdown:n].
IC Domain
6:nvn:[person:n, have:v, pass:n]
3:nvn:[person:n, see:v, pass:n]
1:nvnpn:[person:n, wear:v, pass:n, around:in, neck:n]
BIO Domain
<No results>
Domain issue
?> X:receive:Y
NFL Domain
55:nvn:[person:n, receive:v, call:n].
34:nvn:[person:n, receive:v, offer:n].
33:nvn:[person:n, receive:v, bonus:n].
29:nvn:[team:class, receive:v, pick:n].
IC Domain
78 nvn:[person:n, receive:v, call:n]
44 nvn:[person:n, receive:v, letter:n]
35 nvn:[group:n, receive:v, information:n]
31 nvn:[person:n, receive:v, training:n]
BIO Domain
24 nvn:[patients:n, receive:v, treatment:n]
14 nvn:[patients:n, receive:v, therapy:n]
13 nvn:[patients:n, receive:v, care:n]
Conclusions
Limiting to a specific domain provides some powerful benefits
• Ambiguity is reduced
• Higher density of relevant propositions
• Different distribution of propositions across domains
• Amount of source text is reduced, allowing deeper processing such as parsing
• Specific tools for specific domains
Proposition stores seem to be useful
• Improve parsing, coreference, WSD, …
We presented a new application: ENRICHMENT
Current work
Develop automatic procedures for enrichment
Need better Proposition Stores
• Selectional preferences
• Lexical relatedness
• Structural/frame transformations
• …
Future work
Develop appropriate methodologies for evaluation
• Intrinsic?
• Extrinsic: QA over single documents?
• Reading comprehension tests?
Thanks!
NVN 3 'quarterback':'find':'receiver'
NVNPN 3 'quarterback':'throw':'pass':'to':'receiver'
NVNPN 2 'quarterback':'complete':'pass':'to':'receiver'
NVNPN 1 'receiver':'catch':'pass':'from':'quarterback'
nvn:('NNP':'quarterback'):'hit':('NNP':'receiver'),177).
nvnpn:('NNP':'quarterback'):'throw':'pass':'to':('NNP':'receiver'),143).
nvnpn:('NNP':'quarterback'):'complete':'pass':'to':('NNP':'receiver'),79).
nvn:('NNP':'quarterback'):'find':('NNP':'receiver'),69).
nvnpn:('NNP':'receiver'):'catch':'pass':'from':('NNP':'quarterback'),43).