BipRank: Ranking and Summarizing RDF Vocabulary Descriptions
ws.nju.edu.cn
BipRank: Ranking and Summarizing RDF Vocabulary Descriptions
Gong Cheng (1), Feng Ji (2), Shengmei Luo (2), Weiyi Ge (1), Yuzhong Qu (1)
(1) State Key Laboratory for Novel Software Technology, Nanjing University, China
(2) Communication Services R&D Institute, ZTE Corporation, China
Presented at JIST2011
Gong Cheng (程龚 ) [email protected] 2 of 25
Outline
Introduction
Salience measurement
Vocabulary summarization
Conclusions
Vocabularies and Linked Data
Linked Data
Vocabularies
Your own vocabulary
Reuse
Vocabulary summarization
Vocabulary summarization = ranking and selecting RDF sentences
A bipartite view of vocabulary description
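The bipartite view can be illustrated with a small sketch. It assumes that one side of the graph holds RDF sentences and the other side holds the terms they mention, which matches the "sentence-term graph" named in the conclusions; the function name, the input shape, and the toy identifiers are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def build_sentence_term_graph(sentences):
    """Build both adjacency maps of a sentence-term bipartite graph.

    `sentences` maps a sentence id to the terms it mentions
    (an assumed input shape, chosen for illustration only).
    """
    sentence_to_terms = {sid: set(terms) for sid, terms in sentences.items()}
    term_to_sentences = defaultdict(set)
    for sid, terms in sentence_to_terms.items():
        for term in terms:
            # An edge links a sentence to every term it mentions.
            term_to_sentences[term].add(sid)
    return sentence_to_terms, dict(term_to_sentences)


# Toy input with made-up identifiers.
toy = {
    "s1": ["ex:Person", "rdfs:subClassOf", "ex:Agent"],
    "s2": ["ex:name", "rdfs:domain", "ex:Person"],
}
s2t, t2s = build_sentence_term_graph(toy)
```

Here `t2s["ex:Person"]` lists both sentences, since the two sentences share that term.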
BipRank
type-A behavior
type-B behavior
Next step
Current step
Uniform?
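The transcript does not preserve how the type-A and type-B behaviors are defined, so the following is only a generic random walk on a sentence-term bipartite graph and not the BipRank formulation itself. It assumes uniform transition probabilities (the very choice the slide questions), a damping factor of 0.85, and that every sentence mentions at least one term; all names are hypothetical.

```python
def bipartite_walk_scores(sentence_to_terms, damping=0.85, iterations=50):
    """Score sentences by a damped walk: sentence -> term -> sentence.

    Assumes every sentence has at least one term, so no score mass is lost.
    """
    sentences = sorted(sentence_to_terms)
    term_to_sentences = {}
    for sid in sentences:
        for term in sentence_to_terms[sid]:
            term_to_sentences.setdefault(term, []).append(sid)

    n = len(sentences)
    score = {sid: 1.0 / n for sid in sentences}
    for _ in range(iterations):
        # Step 1: each sentence splits its score evenly over its terms.
        term_score = {term: 0.0 for term in term_to_sentences}
        for sid in sentences:
            terms = sentence_to_terms[sid]
            for term in terms:
                term_score[term] += score[sid] / len(terms)
        # Step 2: each term splits its score evenly over its sentences;
        # the remaining (1 - damping) mass is spread uniformly.
        new_score = {sid: (1.0 - damping) / n for sid in sentences}
        for term, sids in term_to_sentences.items():
            for sid in sids:
                new_score[sid] += damping * term_score[term] / len(sids)
        score = new_score
    return score


# Toy graph: s1 shares one term with s2 and another with s3.
scores = bipartite_walk_scores({"s1": {"a", "b"}, "s2": {"a"}, "s3": {"b"}})
```

In this toy graph the sentence that shares terms with both others ends up with the highest score, which is the intuition behind walk-based salience.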
p(s|u)
Frequency of Pattern(s): the number of RDF sentences in the vocabulary that have the same pattern
Popularity of Pattern(s): the number of vocabularies in the repository that have the same pattern
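The two counts can be sketched as follows. The transcript does not say how Pattern(s) is derived from a sentence, nor how the counts feed into p(s|u), so this example takes pattern labels as given input and only does the counting; the function names, the input shape, and the labels are hypothetical.

```python
from collections import Counter


def pattern_frequency(vocabulary_patterns):
    """Count, per pattern, the sentences of ONE vocabulary that have it.

    `vocabulary_patterns` holds one pattern label per RDF sentence.
    """
    return Counter(vocabulary_patterns)


def pattern_popularity(repository):
    """Count, per pattern, the vocabularies in the repository that have it.

    `repository` maps a vocabulary id to its list of pattern labels.
    """
    popularity = Counter()
    for patterns in repository.values():
        # set() makes each vocabulary count at most once per pattern.
        popularity.update(set(patterns))
    return popularity


# Toy repository with made-up pattern labels.
repo = {
    "vocabA": ["subClassOf", "subClassOf", "domain"],
    "vocabB": ["subClassOf", "range"],
}
freq_a = pattern_frequency(repo["vocabA"])
pop = pattern_popularity(repo)
```

Frequency is local to one vocabulary, while popularity is computed across the whole repository, so the same pattern can be frequent in one vocabulary yet unpopular overall.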
Evaluation setting
Test cases: 9 moderate-sized vocabularies randomly selected from Falcons
Gold standard: salience given by 6 human experts
Competitors:
  Cp: Zhang et al. (WWW2007)
Our approach:
  BipRank-U: pattern-unaware
  BipRank-F: using pattern frequency
  BipRank-P: using pattern popularity
Metric: Pearson product-moment correlation coefficient
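For reference, the metric can be computed as in the sketch below, which compares computed salience scores with expert-assigned ones. The numbers are made up for illustration and are not results from the evaluation; how the six experts' judgments were combined is not stated in the transcript.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Made-up scores for four RDF sentences (illustration only).
computed = [0.40, 0.25, 0.20, 0.15]
expert = [5.0, 3.0, 2.0, 2.0]
r = pearson(computed, expert)
```

A value near 1 means the computed ranking scores rise and fall together with the expert scores; a value near 0 means no linear relationship.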
Goodness of a summary
Salience
Query relevance: textual similarity between query and summary
Cohesion: term overlap between RDF sentences
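The slide does not specify which similarity or overlap measures are used. As one possible reading, the sketch below uses Jaccard overlap for both: between query tokens and summary tokens for query relevance, and averaged over sentence pairs for cohesion. The measure choice, the whitespace tokenization, and all function names are assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def query_relevance(query, summary_texts):
    """Token overlap between the query and the text of the whole summary."""
    query_tokens = query.lower().split()
    summary_tokens = " ".join(summary_texts).lower().split()
    return jaccard(query_tokens, summary_tokens)


def cohesion(summary_term_sets):
    """Average pairwise term overlap between the sentences of a summary."""
    pairs = list(combinations(summary_term_sets, 2))
    if not pairs:
        return 0.0  # a summary of fewer than two sentences has no pairs
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

For example, `cohesion([{"ex:Person", "ex:Agent"}, {"ex:Person", "ex:name"}])` is positive because the two sentences share `ex:Person`, while two sentences with disjoint terms score 0.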
Looking for the best summary
Multi-objective optimization
Single aggregate objective function
Solution: a greedy strategy
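A greedy strategy over a single aggregate objective might look like the sketch below. The weighted-sum form of the objective, the weights, and the per-sentence scores are assumptions for illustration; the transcript does not give the actual aggregate function.

```python
def make_objective(salience, relevance, terms, weights=(1.0, 1.0, 1.0)):
    """Fold three criteria into one score for a candidate summary.

    A weighted sum is assumed here; `salience` and `relevance` map sentence
    ids to scores, and `terms` maps sentence ids to their term sets.
    """
    w_salience, w_relevance, w_cohesion = weights

    def objective(summary):
        total_salience = sum(salience[s] for s in summary)
        total_relevance = sum(relevance[s] for s in summary)
        # Cohesion: number of terms shared by each pair of selected sentences.
        shared = sum(
            len(terms[a] & terms[b])
            for i, a in enumerate(summary)
            for b in summary[i + 1:]
        )
        return (w_salience * total_salience
                + w_relevance * total_relevance
                + w_cohesion * shared)

    return objective


def greedy_summary(candidates, objective, size):
    """Repeatedly add the sentence that maximizes the objective so far."""
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < size:
        best = max(remaining, key=lambda c: objective(selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy scores (made up for illustration).
objective = make_objective(
    salience={"s1": 0.5, "s2": 0.3, "s3": 0.2},
    relevance={"s1": 0.1, "s2": 0.6, "s3": 0.0},
    terms={"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"d"}},
    weights=(1.0, 1.0, 0.1),
)
summary = greedy_summary(["s1", "s2", "s3"], objective, size=2)
```

The greedy loop evaluates each remaining sentence once per step, so it avoids enumerating every possible subset, at the cost of possibly missing the globally best summary.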
Evaluation setting
Judges: 18 human experts
Test cases: 190 searches over 2,012 vocabularies crawled by Falcons
Competitors:
  Generic: Zhang et al. (WWW2007)
Our approach:
  QR: query relevance
  QR+S: query relevance + salience
  QR+C: query relevance + cohesion
Metric: rating on a 10-point scale
Performance testing
Size of vocabulary
Size of summary
Runtime
Conclusions
Salience measurement:
  Sentence-term graph
  BipRank
  Pattern of RDF sentence
Vocabulary summarization:
  Salience
  Query relevance
  Cohesion
Implemented in Falcons Ontology Search: http://ws.nju.edu.cn/falcons/ontologysearch/