Graph Summarization in Annotated Data using Probabilistic...
Transcript of Graph Summarization in Annotated Data using Probabilistic...
![Page 1: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/1.jpg)
Graph Summarization in Annotated Data using Probabilistic Soft Logic
Alex Memory1, Angelika Kimmig1,2, Stephen H. Bach1, Louiqa Raschid1 and Lise Getoor1
1University of Maryland2KU Leuven
Sunday, November 11, 12
![Page 2: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/2.jpg)
Agenda
• Annotation Graphs and Graph Summarization
• Heuristics for Graph Summarization
• Probabilistic Soft Logic
• Evaluation and Results
2
Sunday, November 11, 12
![Page 3: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/3.jpg)
Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038 / nrc2036
Xu Wanga, Yangyang Bianb, Kai Chengb, Li-Fei Gua, Mingliang Yeb, Hanfa Zoub, Jun-Xian Hea,, A large-scale protein phosphorylation analysis reveals novel phosphorylation motifs and phosphoregulatory networks in Arabidopsis, Journal of Proteomics, Oct 2012.
Anna Strum
illo, http://ww
w.fotopedia.com
/items/6nf9pniglhbor-oaPbkJEnrX
8
Plant Ontology Gene Ontology
Analysis of the genome sequence of the flowering plant Arabidopsis thalianaand The Arabidopsis Genome InitiativeNature 408, 796-815(14 December 2000)doi:10.1038/35048692
Annotated Data: the Arabidopsis thaliana Genome
3Sunday, November 11, 12
![Page 4: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/4.jpg)
The Arabidopsis Information Resource
http://www.arabidopsis.org/cgi-bin/bulk/go/getgo.pl 4Sunday, November 11, 12
![Page 5: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/5.jpg)
Graph Summarization Example 1 5Sunday, November 11, 12
![Page 6: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/6.jpg)
Graph Summarization Example 1 6
Thor, A., Anderson, P., Raschid, L., Navlakha, S., Saha, B., Khuller, S., Zhang, X.N.: Link prediction for annotation graphs using graph summarization. In: International Semantic Web Conference (ISWC). (2011)
Sunday, November 11, 12
![Page 7: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/7.jpg)
Graph Summarization Example 2 7Sunday, November 11, 12
![Page 8: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/8.jpg)
Graph Summarization Example 2 8Sunday, November 11, 12
![Page 9: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/9.jpg)
Heuristics for Graph Summarization
• Similarity
• Annotation Links
• Neighbor Sets
9
Sunday, November 11, 12
![Page 10: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/10.jpg)
Similarity Heuristic
Nodes in same cluster should
be similar
10Sunday, November 11, 12
![Page 11: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/11.jpg)
Annotation Link Heuristic
Shared neighbor ➔ same cluster
Neighbor not shared ➔ different clusters
11Sunday, November 11, 12
![Page 12: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/12.jpg)
Neighbor Sets Heuristic
Similar set of neighbors
➔ same cluster
b ca
b ca
b ca d
d e
d e12
Sunday, November 11, 12
![Page 13: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/13.jpg)
Affinity Propagation
• one exemplar per cluster
• every node has to choose an exemplar
Frey, B.J., Dueck, D.: Clustering by passing messages between data points. Science 315 (2007) 972–976 13
Sunday, November 11, 12
![Page 14: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/14.jpg)
Probabilistic Reasoningon Relational Data
• Fuzzy Logic [Lee, 1972]
• Undirected Graphical Models, FOL template language [Richardson et al, 2006]
• Probabilistic Ontologies [Costa et al, 2006]
14Lee, R.C.T.:Fuzzy logic and the resolution principle. J.ACM19(1)(1972)109–119
Richardson, M., Domingos, P.: Markov logic networks. Machine Learning 62(1-2) (2006) 107–136
Costa, P. C. G., and K. B. Laskey. “PR-OWL: A Framework for Probabilistic Ontologies.” Frontiers in Artificial Intelligence and Applications 150 (2006): 237.
Sunday, November 11, 12
![Page 15: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/15.jpg)
Probabilistic Soft Logic (PSL)• Declarative language for probabilistic reasoning
• First order rules
• Soft truth values
• Similarities
• Set similarity
• Efficient inference
• http://psl.umiacs.umd.edu/
Input Database
Rules+
Probability Distribution over Interpretations
15Broecheler, M., Mihalkova, L., Getoor, L.: Probabilistic similarity logic. In: Conference on Uncertainty in Artificial Intelligence (UAI). (2010)
Sunday, November 11, 12
![Page 16: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/16.jpg)
Graphs and Clusters in PSLPHOT1
cauline leaf
shoot apex
CRY2
vacuole
stomatal movement
response to water deprivation
PHOT1
cauline leaf
shoot apex
CRY2vacuole
stomatal movement
response to water deprivation
link(sa,p1)= 1link(sa,c2) = 1link(cl,p1) = 1
...link(c2,v) = 1link(c2,rw) = 1
similar(sa,cl) = .9similar(p1,c2) = .5similar(sm,v) = .1
...
exemplar(sa,cl)exemplar(cl,cl)exemplar(p1,p1)exemplar(c2,p1)exemplar(sm,v)exemplar(v,v)
exemplar(rq,rw)
16Sunday, November 11, 12
![Page 17: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/17.jpg)
GS Heuristics in PSL• Similarity Heuristic
1 : exemplar(A,B) -> similar(A,B)
A
B
17
Sunday, November 11, 12
![Page 18: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/18.jpg)
GS Heuristics in PSL• Annotation Link Heuristic
1 : link(A,B) & link(C,B) & exemplar(A,D) -> exemplar(C,D) 1 : link(A,B) & exemplar(A,C) & exemplar(D,C) -> link(C,B)
A
B
C
D
18
Sunday, November 11, 12
![Page 19: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/19.jpg)
GS Heuristics in PSL• Neighbor Sets Heuristic
1 : sameNeighbors({A.link}, {C.link}) & exemplar(A,B) -> exemplar(C,B)
B
A Y
C
Z
Y Z
19
Sunday, November 11, 12
![Page 20: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/20.jpg)
GS Heuristics in PSL• Affinity Propagation
1000 : exemplar(A,B) -> exemplar(B,B)
Functional: exemplar
20
Sunday, November 11, 12
![Page 21: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/21.jpg)
GS Heuristics in PSL• Similarity Heuristic
• Annotation Link Heuristic
• Neighbor Sets Heuristic
• Affinity Propagation1000 : exemplar(A,B) -> exemplar(B,B)
Functional: exemplar
1 : exemplar(A,B) -> similar(A,B)
1 : link(A,B) & link(C,B) & exemplar(A,D) -> exemplar(C,D) 1 : link(A,B) & exemplar(A,C) & exemplar(D,C) -> link(D,B)
1 : sameNeighbors({A.link}, {C.link}) & exemplar(A,B) -> exemplar(C,B)
21
Sunday, November 11, 12
![Page 22: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/22.jpg)
Distance to Satisfactionrule satisfied ⇔
truth value of body ≤ truth value of head
in1(a,b) -> out1(a,b) distance to satisfaction
0.5 0.6 0
0.9 0.6 0.30.7 0.6 0.1
1.0 0.0 1.0
22Sunday, November 11, 12
![Page 23: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/23.jpg)
Distance to Satisfactionrule satisfied ⇔
truth value of body ≤ truth value of head
out1(a,b) -> in1(a,b) distance to satisfaction
0.5 0.6 0
0.9 0.6 0.30.7 0.6 0.1
1.0 0.0 1.0
23Sunday, November 11, 12
![Page 24: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/24.jpg)
Probabilistic Model
f(I) =1
Zexp[�
X
r2R
�r(dr(I))p]
Z =
Z
Iexp[�
X
r2R
�r(dr(I))p]
Set of rule groundings
Rule’s weightInterpretation
Rule’s distance to satisfaction given I∈{1,2}
Normalization constant
24Sunday, November 11, 12
![Page 25: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/25.jpg)
MPE Inference in PSL• Convex optimization problem
• Exact inference in polynomial time by transformation to second order cone program
• New solver based on consensus optimization linear time in practice [Bach et al, NIPS 12]
25Bach, S., Broecheler, M., Getoor, L., O'Leary, D.. "Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization" Advances in Neural Information Processing Systems (NIPS) - 2012
Sunday, November 11, 12
![Page 26: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/26.jpg)
Evaluation
Sunday, November 11, 12
![Page 27: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/27.jpg)
$7�*�����
*2B�������
*2B�������$7�*�����
32B�������
$7�*�����
*2B�������
$7�*�����$7�*�����
32B�������
$7�*�����
$7�*�����
32B�������
32B�������
$7�*�����
32B�������
32B�������
$7�*�����32B�������
32B�������
$7�*�����*2B�������
$7�*�����
*2B�������
32B�������
32B�������
$7�*�����
$7�*�����
32B�������
*2B�������
$7�*�����
32B�������
*2B������� 32B�������
*2B�������
$7�*�����
32B�������
$7�*�����
32B�������
*2B�������
32B�������
32B�������
*2B������� 32B�������$7�*�����
32B�������
32B�������
32B�������
*2B�������
32B������� 32B�������
$7�*�����
32B�������
32B�������
*2B�������
32B�������
32B�������
32B�������
32B�������
32B�������
32B�������
32B�������
*2B�������
*2B�������
32B�������
32B�������
32B�������
*2B�������
*2B�������
32B�������
32B�������
*2B�������
32B�������
*2B�������
32B�������
32B�������
*2B�������
Input Graph
Annotation Links
GO Taxonomic Distance
27
Gene Sequence Similarity
PO Taxonomic Distance
Sunday, November 11, 12
![Page 28: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/28.jpg)
$7�*�����
*2B�������
*2B�������$7�*�����
32B�������
$7�*�����
*2B�������
$7�*�����$7�*�����
32B�������
$7�*�����
$7�*�����
32B�������
32B�������
$7�*�����
32B�������
32B�������
$7�*�����32B�������
32B�������
$7�*�����*2B�������
$7�*�����
*2B�������
32B�������
32B�������
$7�*�����
$7�*�����
32B�������
*2B�������
$7�*�����
32B�������
*2B������� 32B�������
*2B�������
$7�*�����
32B�������
$7�*�����
32B�������
*2B�������
32B�������
32B�������
*2B������� 32B�������$7�*�����
32B�������
32B�������
32B�������
*2B�������
32B������� 32B�������
$7�*�����
32B�������
32B�������
*2B�������
32B�������
32B�������
32B�������
32B�������
32B�������
32B�������
32B�������
*2B�������
*2B�������
32B�������
32B�������
32B�������
*2B�������
*2B�������
32B�������
32B�������
*2B�������
32B�������
*2B�������
32B�������
32B�������
*2B�������
Leave-one-out Link Prediction
1. Leave one out
2. Cluster
3. Predict missing link using clusters
28
Sunday, November 11, 12
![Page 29: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/29.jpg)
Arabidopsis thalianaData Sets
DS1 DS2 DS3
Genes
PO Terms
GO Terms
PO-Gene
GO-Gene
10 10 18
53 48 40
44 31 19
255 255 218
157 157 92
29
Sunday, November 11, 12
![Page 30: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/30.jpg)
Results from Combining Heuristics
0
0.125
0.25
0.375
0.5
DS1 DS2 DS3
Mea
n A
vera
ge P
reci
sion
LINK1 LINK2SIM1 SIM2SETS
Cluster Link Prediction
LINK1
LINK2
SIM1
SIM2
SETS
Similarity, Shared Neigh.
Similarity, Shared Neigh.
Similarity, Neigh. ¬Shared
Similarity, Neigh. ¬Shared
Similarity Similarity, Shared Neigh.
Similarity Similarity, Neigh. ¬Shared
Similarity,Neigh. Sets
Similarity, Neigh. ¬Shared
30
Sunday, November 11, 12
![Page 31: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/31.jpg)
Results from Combining Heuristics
0
0.125
0.25
0.375
0.5
DS1 DS2 DS3
Mea
n A
vera
ge P
reci
sion
LINK1 LINK2SIM1 SIM2SETS
Cluster Link Prediction
LINK1
LINK2
SIM1
SIM2
SETS
Similarity, Shared Neigh.
Similarity, Shared Neigh.
Similarity, Neigh. ¬Shared
Similarity, Neigh. ¬Shared
Similarity Similarity, Shared Neigh.
Similarity Similarity, Neigh. ¬Shared
Similarity,Neigh. Sets
Similarity, Neigh. ¬Shared Random Guess: ~0.004
31
Sunday, November 11, 12
![Page 32: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/32.jpg)
Results from Combining Heuristics
0
0.125
0.25
0.375
0.5
DS1 DS2 DS3
Mea
n A
vera
ge P
reci
sion
LINK1 LINK2SIM1 SIM2SETS
Cluster Link Prediction
LINK1
LINK2
SIM1
SIM2
SETS
Similarity, Shared Neigh.
Similarity, Shared Neigh.
Similarity, Neigh. ¬Shared
Similarity, Neigh. ¬Shared
Similarity Similarity, Shared Neigh.
Similarity Similarity, Neigh. ¬Shared
Similarity,Neigh. Sets
Similarity, Neigh. ¬Shared Random Guess: ~0.004
32
Sunday, November 11, 12
![Page 33: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/33.jpg)
Results from Combining Heuristics
0
0.125
0.25
0.375
0.5
DS1 DS2 DS3
Mea
n A
vera
ge P
reci
sion
LINK1 LINK2SIM1 SIM2SETS
Cluster Link Prediction
LINK1
LINK2
SIM1
SIM2
SETS
Similarity, Shared Neigh.
Similarity, Shared Neigh.
Similarity, Neigh. ¬Shared
Similarity, Neigh. ¬Shared
Similarity Similarity, Shared Neigh.
Similarity Similarity, Neigh. ¬Shared
Similarity,Neigh. Sets
Similarity, Neigh. ¬Shared Random Guess: ~0.004
Baseline System33
Thor, A., Anderson, P., Raschid, L., Navlakha, S., Saha, B., Khuller, S., Zhang, X.N.: Link prediction for annotation graphs using graph summarization. In: International Semantic Web Conference (ISWC). (2011)
Sunday, November 11, 12
![Page 34: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/34.jpg)
Results from Combining Heuristics
0
0.125
0.25
0.375
0.5
DS1 DS2 DS3
Mea
n A
vera
ge P
reci
sion
LINK1 LINK2SIM1 SIM2SETS
Cluster Link Prediction
LINK1
LINK2
SIM1
SIM2
SETS
Similarity, Shared Neigh.
Similarity, Shared Neigh.
Similarity, Neigh. ¬Shared
Similarity, Neigh. ¬Shared
Similarity Similarity, Shared Neigh.
Similarity Similarity, Neigh. ¬Shared
Similarity,Neigh. Sets
Similarity, Neigh. ¬Shared Random Guess: ~0.004
Baseline System34
Thor, A., Anderson, P., Raschid, L., Navlakha, S., Saha, B., Khuller, S., Zhang, X.N.: Link prediction for annotation graphs using graph summarization. In: International Semantic Web Conference (ISWC). (2011)
Sunday, November 11, 12
![Page 35: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/35.jpg)
Conclusion
• Simple heuristics → competitive results
• Probabilistic Soft Logic for general reasoning over uncertainty
35
Sunday, November 11, 12
![Page 36: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/36.jpg)
Thank you!http://psl.umiacs.umd.edu
Sunday, November 11, 12
![Page 37: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/37.jpg)
Backup
Sunday, November 11, 12
![Page 38: Graph Summarization in Annotated Data using Probabilistic ...c4i.gmu.edu/ursw/2012/files/...presentation.pdf · Hu et al. Nature Reviews Cancer 7, 23–34 (January 2007) | doi:10.1038](https://reader036.fdocuments.in/reader036/viewer/2022090603/60568ba48eed505fa15aa857/html5/thumbnails/38.jpg)
Distance to Satisfactiondr
(I) = max{0, I(rbody
)� I(rhead
)}using Lukasiewicz norms:
I(v1 ^ v2) = max{0, I(v1) + I(v2)� 1}I(v1 _ v2) = min{I(v1) + I(v2), 1}
I(¬l1) = 1� I(v1)
1 : link(a,b) & exemplar(a,a) & exemplar(c,a) -> link(c,b)
1 0.9 0.8 0
max{0,1+0.9+0.8−2}=0.7 d=0.71 0.5 0.3 0
max{0,1+0.5+0.3−2}=0 d=038
Sunday, November 11, 12