Extracting Semantic Networks From Text Via Relational Clustering
Transcript of Extracting Semantic Networks From Text Via Relational Clustering
Extracting Semantic Networks From Text Via Relational Clustering
Stanley Kok
Dept. of Computer Science and Engineering, University of Washington
Seattle, USA
Joint work with Pedro Domingos
Goal: Reading & Understanding Text
[Diagram: a small semantic network — clusters such as country, people, and religion, connected by relations like invade, emigrate to, and embrace — produced from text by the Semantic Network Extractor.]
A Step Towards the Goal
TextRunner [Banko et al., IJCAI'07]: webpages → (object, relation, object) triples
The grand agenda: text → knowledge base → autonomous agent
Example knowledge: man(x) => human(x), child(x,y) => parent(y,x), …
Snippet of Extracted Semantic Network
Object clusters: {America, US, USA, Australia, Britain, UK, China, Spain, Iraq, Germany, …}, {EU, European_Union}, {UN, United_Nations}, {army, troops, force, forces, navy}
Relation clusters: {export, import, imports, importation}, {part, role}, {pulled, withdrew, to_withdraw, to_remove}, {banned, had_banned, prohibited, restricted}, {played, has_played, will_play, …}
Motivation
Supervised approaches: require manual annotation of training data; not scalable to the Web. E.g., Semantic Parsing [Wong & Mooney, ACL'07].
Unsupervised approaches: extract noisy & sparse ground facts, with no high-level knowledge that generalizes the ground facts. E.g., TextRunner [Banko et al., IJCAI'07].
SNE: unsupervised, domain-independent, scales to the Web; text → simple semantic network.
Abundance of Web text → KB
Overview
Motivation, Background, Semantic Network Extractor, Experiments, Future Work
Markov Logic
A logical KB is a set of hard constraints on the set of possible worlds.
Let's make them soft constraints: when a world violates a formula, it becomes less probable, not impossible.
Give each formula a weight (higher weight → stronger constraint):
P(world) ∝ exp(Σ weights of formulas it satisfies)
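Written out in full, this is the standard Markov logic distribution (as in Richardson & Domingos's formulation):

```latex
P(X = x) = \frac{1}{Z} \exp\!\Big( \sum_i w_i \, n_i(x) \Big)
```

where $w_i$ is the weight of formula $i$, $n_i(x)$ is the number of true groundings of formula $i$ in world $x$, and $Z$ is the normalizing partition function.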
TextRunner [Banko et al., IJCAI'07]
Extracts (object, relation, object) triples from webpages in a single pass:
Identify nouns with a noun-phrase chunker
Heuristically identify the string between two nouns as the relation
Classify each triple as true or false using a Naïve Bayes classifier
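As a rough illustration of the middle-string heuristic (a toy sketch, not TextRunner's actual code — it assumes noun phrases have already been identified and are supplied as token spans):

```python
def extract_triples(tokens, np_spans):
    """Toy middle-string heuristic: for each pair of adjacent noun
    phrases, treat the tokens between them as the relation string."""
    triples = []
    for (s1, e1), (s2, e2) in zip(np_spans, np_spans[1:]):
        relation = "_".join(tokens[e1:s2])
        if relation:  # skip adjacent NPs with nothing between them
            obj1 = "_".join(tokens[s1:e1])
            obj2 = "_".join(tokens[s2:e2])
            triples.append((obj1, relation, obj2))
    return triples

tokens = "the Moon orbits the Earth".split()
np_spans = [(0, 2), (3, 5)]  # "the Moon", "the Earth"
print(extract_triples(tokens, np_spans))
# -> [('the_Moon', 'orbits', 'the_Earth')]
```

TextRunner additionally filters the candidate triples with a Naïve Bayes classifier, which this sketch omits.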
Semantic Network Extractor
Input: tuples r(x,y)
Output: simple semantic network
Clusters objects and relations simultaneously; the number of clusters need not be specified in advance
Clusters relations by the objects they relate, and vice versa
[Figure: examples illustrating a cluster of symbols and a clustering (a partition of symbols into clusters).]
Notation
Atom: a ground tuple r(x,y)
Cluster combination: a cluster of relation symbols paired with a cluster of x symbols and a cluster of y symbols
[Figure (two slides): relation symbols r1…r7 connecting object symbols x1…x6 to y1…y5, with example atoms and a cluster combination highlighted.]
SNE Model
Four rules:
Each symbol belongs to exactly one cluster
Exponential prior on the number of cluster combinations
Most symbol pairs tend to be in different clusters
SNE Model (cont.)
Atom prediction rule: the truth value of an atom is determined by the cluster combination it belongs to
The weight of the rule is the log-odds of an atom in its cluster combination being true, estimated from the numbers of true and false atoms in the cluster combination, with smoothing parameters
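A minimal sketch of such a smoothed log-odds weight; the pseudo-count names `alpha` and `beta` are illustrative stand-ins for the paper's smoothing parameters, not its actual notation:

```python
from math import log

def atom_rule_weight(n_true, n_false, alpha=1.0, beta=1.0):
    """Log-odds that an atom in this cluster combination is true,
    smoothed with pseudo-counts alpha (true) and beta (false)."""
    p_true = (n_true + alpha) / (n_true + n_false + alpha + beta)
    return log(p_true / (1.0 - p_true))

print(atom_rule_weight(3, 1))  # mostly-true combination -> positive weight
```

The smoothing keeps the weight finite even when a cluster combination contains only true (or only false) atoms.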
Learning SNE Model
Learning consists of finding the weights of the atom prediction rules and the cluster assignment of the r, x, and y symbols that maximize the log-posterior probability, given the vector of truth assignments to all observed ground atoms r(x,y).
The log-posterior combines contributions from the first three rules and from the atom prediction rule.
Log Posterior
[Equation: the log posterior — a sum over the set of cluster combinations of terms involving the probabilities that atoms are true and false, plus a penalty on the number of cluster combinations (exponential prior), a term over the pairs of symbols in different clusters, and a constant.]
Intractable!
Number of Cluster Combinations
[Figure (two slides): example clusters — objects such as {space_shuttle, Columbia, Voyager}, {astronomers}, {Earth, planet}, {Kennedy_Space_Center} and relations {delivered_to}, {orbits}, {think_about} — illustrating how many cluster combinations the clusters induce.]
Log-Posterior (Approximation)
Assume atoms in cluster combinations with only false atoms all belong to a single 'default' cluster combination.
[Equation terms: S_i, the set of symbols of type i; the number of false atoms in cluster combinations with only false atoms; Pr(atom = false); and the set of, and number of, cluster combinations with ≥ 1 true r(x,y) atom.]
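Assembling the quantities named on this slide, the approximation can be written schematically as follows (a reconstruction from the slide's labels, not necessarily the paper's exact equation):

```latex
\log P \;\approx\; \sum_{k \in K^{+}} \Big[ t_k \log p_k + f_k \log(1 - p_k) \Big]
\;+\; d \,\log p_{\mathrm{false}} \;-\; \lambda\, |K^{+}| \;+\; \mathrm{const}
```

where $K^{+}$ is the set of cluster combinations with $\geq 1$ true r(x,y) atom, $t_k$ and $f_k$ are the true/false atom counts in combination $k$, $p_k$ is the probability that an atom in $k$ is true, $d$ is the number of false atoms lumped into the single 'default' cluster combination, and $\lambda$ is the exponential-prior parameter.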
Search Algorithm
Approximation: hard assignment of symbols to clusters
Search over cluster assignments, evaluating each by its log-posterior
Agglomerative clustering: start with each r, x, and y symbol in its own cluster; merge pairs of clusters in a bottom-up manner
[Figure: the relations orbits and revolves_around being merged into a single cluster.]
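A hedged sketch of this greedy bottom-up search. It is deliberately simplified: a made-up additive per-cluster score stands in for SNE's log-posterior, and none of the paper's speedups are included:

```python
def agglomerative_cluster(symbols, score):
    """Greedy bottom-up clustering: start with singletons, repeatedly
    merge the pair of clusters whose merge most improves `score`
    (a stand-in for SNE's log-posterior), stopping when no merge helps."""
    clusters = [frozenset([s]) for s in symbols]
    improved = True
    while improved:
        improved = False
        best_gain, best_pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] | clusters[j]
                gain = score(merged) - score(clusters[i]) - score(clusters[j])
                if gain > best_gain:
                    best_gain, best_pair = gain, (i, j)
        if best_pair:
            i, j = best_pair
            merged = clusters[i] | clusters[j]
            clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
            clusters.append(merged)
            improved = True
    return clusters

# Toy score: reward clusters whose symbols share a first letter.
def score(cluster):
    first_letters = {s[0] for s in cluster}
    return len(cluster) ** 2 if len(first_letters) == 1 else -1.0

print(agglomerative_cluster(["orbits", "observes", "revolves"], score))
```

Under this toy score, "orbits" and "observes" merge while "revolves" stays in its own cluster; in SNE the gain of each candidate merge is instead the change in the approximate log-posterior.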
[Figure (three slides): worked example of the search over the symbols Apollo 10, Odyssey, space_shuttle, Moon, and Earth — the clusters {Apollo 10, Odyssey, space_shuttle} and {orbits, revolves_around} are formed by successive merges.]
Dataset
2.1 million triples extracted in a Web crawl by TextRunner [Banko et al., IJCAI'07], e.g., named_after(Jupiter, Roman_god), upheld(Court, ruling)
15,872 r symbols, 700,781 x symbols, 665,378 y symbols
Only consider symbols appearing ≥ 25 times: 10,214 r symbols, 8,942 x symbols, 7,995 y symbols; 2,065,045 triples contain at least one such symbol
Comparison Systems
Multiple Relational Clustering (MRC) [Kok & Domingos, ICML'07]: similar to SNE, but finds multiple clusterings; exponential prior on the number of clusters; no "symbol pairs tend to be in different clusters" rule
Information-Theoretic Co-clustering (ITC) [Dhillon et al., KDD'03]: clusters data in a 2D matrix along both dimensions, maximizing mutual information between row and column clusters; we extended it to 3D and to use a BIC prior on the number of cluster combinations
Infinite Relational Model (IRM) [Kemp et al., AAAI'06]: generative model Beta → p → Bernoulli → atoms; we changed it to use a CRP prior on the number of cluster combinations
Search algorithms of all systems changed to SNE's agglomerative clustering
Evaluation
Pairwise precision, recall, & F1 against a manually created gold standard
2,688 r symbols, 2,568 x symbols, 3,058 y symbols assigned to non-unit clusters (874 r, 511 x, 700 y non-unit clusters); remaining symbols assigned to unit clusters
Correct semantic statements: evaluated on cluster combinations with ≥ 5 true ground r(x,y) atoms
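Pairwise clustering metrics count pairs of symbols rather than whole clusters: a pair is a true positive if both symbols share a cluster in the prediction and in the gold standard. A self-contained sketch (the cluster contents below are made-up examples):

```python
from itertools import combinations

def pairwise_prf(predicted, gold):
    """Pairwise precision/recall/F1 between two clusterings, each given
    as an iterable of sets of symbols."""
    def same_cluster_pairs(clusters):
        pairs = set()
        for cluster in clusters:
            for a, b in combinations(sorted(cluster), 2):
                pairs.add((a, b))
        return pairs

    pred_pairs = same_cluster_pairs(predicted)
    gold_pairs = same_cluster_pairs(gold)
    tp = len(pred_pairs & gold_pairs)
    precision = tp / len(pred_pairs) if pred_pairs else 0.0
    recall = tp / len(gold_pairs) if gold_pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

predicted = [{"US", "USA", "Britain"}, {"UN"}]
gold = [{"US", "USA"}, {"Britain", "UK"}, {"UN"}]
print(pairwise_prf(predicted, gold))  # -> (0.333..., 0.5, 0.4)
```

Unit clusters contribute no pairs, which is why the slide reports counts of symbols in non-unit clusters.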
Parameter Settings
Closed-world assumption: triples not in the DB are assumed false
SNE parameters: [two parameters] = 100, p_false = 0.9999, [one parameter] = 2.81 × 10⁻⁹, [one parameter] = 10⁻[?]; the ratio of the smoothing parameters set to the fraction of true triples in the dataset
Tried various parameter values for MRC, ITC, and IRM, and chose the best ones
SNE vs. MRC
[Bar charts: precision, recall, and F1 of SNE and MRC on relation clusters and on object clusters.]
SNE vs. IRM vs. ITC
[Bar charts: precision, recall, and F1 of SNE, IRM, and ITC on relation clusters and on object clusters.]
SNE vs. ITC vs. IRM: Semantic Statements
[Bar chart: total statements vs. number correct for SNE, IRM, and ITC; per-system accuracies of 0.778, 0.874, and 0.835 appear on the bars; SNE produces over 2× and over 3× as many correct statements as the others.]
SNE vs. ITC vs. IRM: Run Time
[Bar chart: run time in hours for SNE, IRM, and ITC.]
SNE Full Joint Model vs. Separate Clustering
[Bar charts: precision, recall, and F1 of SNE and SNE-Sep on relation clusters and on object clusters.]
SNE and WordNet
Compare SNE's object clusters with WordNet: 5,000 object symbols overlap with WordNet
Convert each node (synset) in the WordNet taxonomy to also contain its children concepts
Match each SNE cluster to the WordNet cluster with the best F1 score
The lower the matched cluster sits in the WordNet taxonomy, the more precise the concept
Levels of Matched WordNet Clusters
[Chart: WordNet taxonomy level (0–10) of each matched cluster, plotted against SNE cluster size (47, 36, 24, 19, 16, 12, 11, 10, 8, 7, 6, 5, 4, 3, 2).]
Snippet of Extracted Semantic Network
Object clusters: {brother, father, mother, parents, couple, family, friends}, {research, studies, study, results}, {America, US, Australia, Austria, Britain, India, Germany, …}, {Islam, Judaism, Catholicism, Christianity, Protestantism}
Relation clusters: {emigrated_to, relocated_to, escaped_to, moved_back_to}, {converted_to, embraced, to_embrace}, {conducted_in, carried_out_in}
Future Work
Integrate tuple extraction into SNE
Learn richer semantic networks
Learn logical theories
Etc.
Conclusion
SNE: unsupervised, domain-independent approach; text → simple semantic network
Takes us a step closer to the "grand agenda" of text → KB
Based on Markov logic, with techniques to scale SNE up to the Web
Comparisons with other approaches show the promise of SNE