Coupling Semi-Supervised Learning of Categories and Relations
CS 652, Peter Lindes 1
Coupling Semi-Supervised Learning of Categories and Relations
Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr., and Tom M. Mitchell
Carnegie Mellon University
9/18/2012
The Problem
“We present an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors.”
• Predefined Categories
  – Unary predicates (instances are noun phrases)
  – Mutually exclusive relationships
  – Some subset relationships
  – Flag: proper nouns, common nouns, or both
  – 10-20 seed instances
  – 5 seed patterns (automatically derived - Hearst, 1992)
• Predefined Relations
  – Binary predicates (an instance is a pair of noun phrases)
  – Mutually exclusive relationships
  – 10-20 seed instances
  – No seed patterns
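The predicate definitions above could be held in a structure along these lines. This is only an illustrative sketch: the class names, field names, and the "city"/"country" examples (including the Hearst-style pattern strings) are assumptions made here, not taken from the paper, and the seed lists are shortened for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Category:
    """A unary predicate; its instances are noun phrases."""
    name: str
    seed_instances: List[str]
    seed_patterns: List[str]
    mutually_exclusive_with: Set[str] = field(default_factory=set)
    subset_of: Set[str] = field(default_factory=set)
    noun_type: str = "both"  # "proper", "common", or "both"


@dataclass
class Relation:
    """A binary predicate; an instance is a pair of noun phrases."""
    name: str
    seed_instances: List[Tuple[str, str]]
    mutually_exclusive_with: Set[str] = field(default_factory=set)
    # Relations start with no seed patterns, so none are stored here.


# Hypothetical examples (real inputs would have 10-20 seeds each).
city = Category(
    name="city",
    seed_instances=["Pittsburgh", "Boston", "Paris"],
    seed_patterns=["cities such as X", "X and other cities"],
    mutually_exclusive_with={"country"},
    noun_type="proper",
)
city_in_country = Relation(
    name="cityInCountry",
    seed_instances=[("Paris", "France"), ("Boston", "United States")],
)
```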
The Predicates
• Taken from “a 200-million page web crawl”
• Filtered for English “using a stop word ratio threshold”
• Filtered out web spam and adult content “using a ‘bad word’ list”
• Segmented, tokenized, and tagged
• Noisy sentences filtered out
• 514-million sentences used for experiment
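A stop word ratio filter of the kind mentioned above might look roughly like the following. The stop word list, the whitespace tokenization, and the 0.2 threshold are all assumptions chosen for illustration; the slides do not give the actual values.

```python
# Tiny illustrative stop word list; a real filter would use a fuller one.
STOP_WORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "is", "that", "it"}
)


def stop_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that are English stop words."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in STOP_WORDS for token in tokens) / len(tokens)


def looks_english(text: str, threshold: float = 0.2) -> bool:
    """Keep a page only if its stop word ratio reaches the (assumed) threshold."""
    return stop_word_ratio(text) >= threshold
```

The idea is that ordinary English text contains a high proportion of function words, so a page with almost none is probably in another language or is not running prose.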
Evaluation
• 3 Questions:
  – “Can CBL iterate many times and still achieve high precision?”
  – “How helpful are the types of coupling that we employ?”
  – “Can we extend existing semantic resources?”
• 3 Configurations
  – Full
  – NS: no sharing of promoted items, seeds shared
  – NCR: no type checking
Results - Precision
Categories (precision, %)
Iterations  Full  NS  NCR
5           92    84  89
10          82    70  84
15          83    63  79

Relations (precision, %)
Iterations  Full  NS  NCR
5           92    86  74
10          83    76  68
15          84    64  62
Precision estimated by human judging of correctness for 30 samples of each predicate.
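The sampling-based estimate described above amounts to simple arithmetic, sketched below. The function names and the fixed random seed are assumptions for illustration; the slides only state that 30 samples per predicate were judged by humans.

```python
import random
from typing import List, Sequence


def sample_for_judging(promoted: Sequence[str], k: int = 30, seed: int = 0) -> List[str]:
    """Draw up to k promoted instances uniformly at random for human review."""
    rng = random.Random(seed)
    return rng.sample(list(promoted), min(k, len(promoted)))


def precision_percent(judgments: Sequence[bool]) -> float:
    """Estimated precision: the share of sampled instances judged correct."""
    if not judgments:
        raise ValueError("no judgments provided")
    return 100.0 * sum(judgments) / len(judgments)


# Example: 27 of 30 sampled instances judged correct.
example_judgments = [True] * 27 + [False] * 3
example_precision = precision_percent(example_judgments)
```

With only 30 samples per predicate, each judgment moves the estimate by about 3.3 percentage points, so small differences between configurations should be read with that granularity in mind.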
Results - Recall
Promoted categories and relations – 15 iterations
“At this stage of development, obtaining high recall is not a priority … it is our hope that high recall will come with time.”
Example Extracted Facts
“We have presented a method of coupling the semi-supervised learning of categories and relations and demonstrated empirically that the coupling forestalls the problem of semantic drift associated with bootstrap learning methods.”
Comparison to Freebase
“… our methods can contribute new facts to existing resources.”