Semi-Supervised Learning With Graphs
-
Upload
cordelia-weeks -
Category
Documents
-
view
60 -
download
0
description
Transcript of Semi-Supervised Learning With Graphs
![Page 1: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/1.jpg)
1
Semi-Supervised Learning With Graphs
William Cohen
![Page 2: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/2.jpg)
2
Review – Graph Algorithms so far….
• PageRank and how to scale it up• Personalized PageRank/Random Walk with Restart and
–how to implement it–how to use it for extracting part of a graph
• Other uses for graphs?–not so much
We might come back to this more
You can also look at the March 19 lecture from the spring 2015 version of this class.
HW6
![Page 3: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/3.jpg)
3
Main topics today
• Scalable semi-supervised learning on graphs–SSL with RWR–SSL with coEM/wvRN/HF
• Scalable unsupervised learning on graphs–Power iteration clustering–…
![Page 4: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/4.jpg)
4
Semi-supervised learning
• A pool of labeled examples L• A (usually larger) pool of unlabeled examples U• Can you improve accuracy somehow using U?
![Page 5: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/5.jpg)
5
Semi-Supervised Bootstrapped Learning/Self-training
ParisPittsburgh
SeattleCupertino
mayor of arg1live in arg1
San FranciscoAustindenial
arg1 is home oftraits such as arg1
anxietyselfishness
Berlin
Extract cities:
![Page 6: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/6.jpg)
6
Semi-Supervised Bootstrapped Learning via Label Propagation
Paris
live in arg1
San FranciscoAustin
traits such as arg1
anxiety
mayor of arg1
Pittsburgh
Seattle
denial
arg1 is home of
selfishness
![Page 7: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/7.jpg)
7
Semi-Supervised Bootstrapped Learning via Label Propagation
Paris
live in arg1
San FranciscoAustin
traits such as arg1
anxiety
mayor of arg1
Pittsburgh
Seattle
denial
arg1 is home of
selfishness
Nodes “near” seeds Nodes “far from” seeds
Information from other categories tells you “how far” (when to stop propagating)
arrogancetraits such as arg1
denialselfishnes
s
![Page 8: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/8.jpg)
8
ASONAM-2010 (Advances in Social Networks Analysis and
Mining)
![Page 9: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/9.jpg)
9
Network Datasets with Known Classes
•UBMCBlog•AGBlog•MSPBlog•Cora•Citeseer
![Page 10: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/10.jpg)
10
RWR - fixpoint of:
Seed selection1. order by PageRank, degree, or randomly2. go down list until you have at least k
examples/class
![Page 11: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/11.jpg)
11
Results – Blog data
Random Degree PageRank
We’ll discuss
this soon….
![Page 12: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/12.jpg)
12
Results – More blog data
Random Degree PageRank
![Page 13: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/13.jpg)
13
Results – Citation data
Random Degree PageRank
![Page 14: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/14.jpg)
14
Seeding – MultiRankWalk
![Page 15: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/15.jpg)
15
Seeding – HF/wvRN
![Page 16: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/16.jpg)
16
What is HF aka coEM aka wvRN?
![Page 17: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/17.jpg)
17
CoEM/HF/wvRN• One definition [MacKassey & Provost, JMLR 2007]:…
Another definition: A harmonic field – the score of each node in the graph is the harmonic (linearly weighted) average of its neighbors’ scores;
[X. Zhu, Z. Ghahramani, and J. Lafferty, ICML 2003]
![Page 18: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/18.jpg)
18
CoEM/wvRN/HF• Another justification of the same algorithm….• … start with co-training with a naïve Bayes learner
![Page 19: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/19.jpg)
19
CoEM/wvRN/HF• One algorithm with several justifications….• One is to start with co-training with a naïve Bayes learner
• And compare to an EM version of naïve Bayes– E: soft-classify unlabeled examples with NB classifier – M: re-train classifier with soft-labeled examples
![Page 20: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/20.jpg)
20
CoEM/wvRN/HF• A second experiment
– each + example: concatenate features from two documents, one of class A+, one of class B+– each - example: concatenate features from two documents, one of class A-, one of class B-– features are prefixed with “A”, “B” disjoint
![Page 21: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/21.jpg)
21
CoEM/wvRN/HF• A second experiment
– each + example: concatenate features from two documents, one of class A+, one of class B+– each - example: concatenate features from two documents, one of class A-, one of class B-– features are prefixed with “A”, “B” disjoint
• NOW co-training outperforms EM
![Page 22: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/22.jpg)
22
CoEM/wvRN/HF
• Co-training with a naïve Bayes learner
• vs an EM version of naïve Bayes– E: soft-classify unlabeled examples with NB classifier – M: re-train classifier with soft-labeled examples
incremental hard assignments
iterative soft assignments
![Page 23: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/23.jpg)
23
Co-Training Rote Learner
My advisor+
-
-
pageshyperlinks
-
--
-
+
+
-
-
-
+
++
+
-
-
+
+
-
![Page 24: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/24.jpg)
24
Co-EM Rote Learner: equivalent to HF on a bipartite graph
Pittsburgh+
-
-
contextsNPs
-
--
-
+
+
-
-
-
+
++
+
-
-
+
+
-
lives in _
![Page 25: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/25.jpg)
25
What is HF aka coEM aka wvRN?
Algorithmically:
• HF propagates weights and then resets the seeds to their initial value
• MRW propagates weights and does not reset seeds
![Page 26: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/26.jpg)
26
MultiRank Walk vs HF/wvRN/CoEM
Seeds are marked S
MRWHF
![Page 27: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/27.jpg)
27
Back to Experiments: Network Datasets with Known Classes
•UBMCBlog•AGBlog•MSPBlog•Cora•Citeseer
![Page 28: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/28.jpg)
28
MultiRankWalk vs wvRN/HF/CoEM
![Page 29: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/29.jpg)
29
How well does MWR work?
![Page 30: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/30.jpg)
30
Parameter Sensitivity
![Page 31: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/31.jpg)
31
Semi-supervised learning
• A pool of labeled examples L• A (usually larger) pool of unlabeled examples U• Can you improve accuracy somehow using U?• These methods are different from EM
– optimizes Pr(Data|Model)• How do SSL learning methods (like label propagation) relate to optimization?
![Page 32: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/32.jpg)
32
SSL as optimizationslides from Partha Talukdar
![Page 33: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/33.jpg)
33
![Page 34: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/34.jpg)
34
yet another name for HF/wvRN/coEM
![Page 35: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/35.jpg)
35
match seeds smoothnessprior
![Page 36: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/36.jpg)
36
![Page 37: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/37.jpg)
37
![Page 38: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/38.jpg)
38
![Page 39: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/39.jpg)
39
How to do this minimization?First, differentiate to find min is at
Jacobi method:• To solve Ax=b for x• Iterate:
• … or:
![Page 40: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/40.jpg)
40
![Page 41: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/41.jpg)
41
![Page 42: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/42.jpg)
42
precision-recall
break even point
/HF/…
![Page 43: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/43.jpg)
43
/HF/…
![Page 44: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/44.jpg)
44
/HF/…
![Page 45: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/45.jpg)
45
from mining patterns like “musicians such as Bob Dylan”
from HTML tables on the web that are
used for data, not
formatting
![Page 46: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/46.jpg)
46
![Page 47: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/47.jpg)
47
![Page 48: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/48.jpg)
48
More recent work (AIStats 2014)• Propagating labels requires usually small number of optimization passes
– Basically like label propagation passes• Each is linear in
– the number of edges – and the number of labels being propagated
• Can you do better?– basic idea: store labels in a countmin sketch– which is basically an compact approximation of an objectdouble mapping
![Page 49: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/49.jpg)
Flashback: CM Sketch Structure
Each string is mapped to one bucket per row Estimate A[j] by taking mink { CM[k,hk(j)] } Errors are always over-estimates Sizes: d=log 1/, w=2/ error is usually less than ||A||
1
+c
+c
+c
+c
h1(s)
hd(s)
<s, +c>
d=log 1/
w = 2/
from: Minos Garofalakis
49
![Page 50: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/50.jpg)
50
More recent work (AIStats 2014)• Propagating labels requires usually small number of optimization passes
– Basically like label propagation passes• Each is linear in
– the number of edges – and the number of labels being propagated– the sketch size– sketches can be combined linearly without “unpacking” them: sketch(av + bw) = a*sketch(v)+b*sketch(w)– sketchs are good at storing skewed distributions
![Page 51: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/51.jpg)
51
More recent work (AIStats 2014)• Label distributions are often very skewed
–sparse initial labels–community structure: labels from other subcommunities have small weight
![Page 52: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/52.jpg)
52
More recent work (AIStats 2014)
Freebase Flick-10k
“self-injection”: similarity computation
![Page 53: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/53.jpg)
53
More recent work (AIStats 2014)
Freebase
![Page 54: Semi-Supervised Learning With Graphs](https://reader035.fdocuments.in/reader035/viewer/2022062300/56812a68550346895d8debed/html5/thumbnails/54.jpg)
54
More recent work (AIStats 2014)
100 Gb available