BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Gong Cheng 1, Feng Ji 2, Shengmei Luo 2, Weiyi Ge 1, Yuzhong Qu 1
1 State Key Laboratory for Novel Software Technology, Nanjing University, China
2 Communication Services R&D Institute, ZTE Corporation, China
Presented at JIST2011

Transcript of BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Page 1: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

ws.nju.edu.cn

BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Gong Cheng 1, Feng Ji 2, Shengmei Luo 2, Weiyi Ge 1, Yuzhong Qu 1

1 State Key Laboratory for Novel Software Technology, Nanjing University, China
2 Communication Services R&D Institute, ZTE Corporation, China

Presented at JIST2011

Page 2: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Gong Cheng (程龚) [email protected] 2 of 25

Outline

Introduction

Salience measurement

Vocabulary summarization

Conclusions

Page 3: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Vocabularies and Linked Data

Linked Data

Vocabularies

Your own vocabulary

Reuse

Page 4: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Vocabulary search engines

Page 5: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Vocabularies

Scale

Page 6: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Vocabulary snippets: state of the art

Page 7: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Vocabulary snippets: our approach

Page 8: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Vocabulary summarization

Vocabulary summarization = ranking and selecting RDF sentences

Page 9: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Outline

Introduction

Salience measurement

Vocabulary summarization

Conclusions

Page 10: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

A bipartite view of vocabulary description
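
The bipartite view can be illustrated with a small sketch: RDF sentences on one side, vocabulary terms on the other, and an edge wherever a sentence mentions a term. The slide image is not preserved in this transcript, so the data model below (sentences as lists of triples, terms as prefixed names, the function name `build_sentence_term_graph`) is an assumption made for illustration, not the authors' implementation.

```python
from collections import defaultdict

def build_sentence_term_graph(sentences, vocabulary_terms):
    """Build both directions of a sentence-term bipartite graph.

    sentences: dict mapping a sentence id to a list of (s, p, o) triples
               (hypothetical representation of an RDF sentence).
    vocabulary_terms: set of terms that count as nodes on the term side.
    """
    sentence_to_terms = {}
    term_to_sentences = defaultdict(set)
    for sid, triples in sentences.items():
        # A sentence is linked to every vocabulary term appearing in its triples.
        terms = {node for triple in triples for node in triple
                 if node in vocabulary_terms}
        sentence_to_terms[sid] = terms
        for term in terms:
            term_to_sentences[term].add(sid)
    return sentence_to_terms, dict(term_to_sentences)

# Toy vocabulary with two RDF sentences (illustrative data only).
sentences = {
    "s1": [("ex:Person", "rdfs:subClassOf", "ex:Agent")],
    "s2": [("ex:knows", "rdfs:domain", "ex:Person")],
}
vocabulary_terms = {"ex:Person", "ex:Agent", "ex:knows"}
s2t, t2s = build_sentence_term_graph(sentences, vocabulary_terms)
```

Keeping both adjacency maps makes it cheap to move from a sentence to its terms and back, which is what a walk over the graph needs.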

Page 11: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Surfer behavior: type A

Page 12: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Surfer behavior: type B

Page 13: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

BipRank

type-A behavior

type-B behavior

Next step

Current step

Uniform?
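
The transcript does not preserve the BipRank formula or the exact definitions of the two surfer behaviors. As a rough sketch only, the code below assumes a PageRank-style walk: with probability `damping` the surfer moves from the current sentence to one of its terms and then to a sentence containing that term, and otherwise jumps to a sentence drawn from a prior distribution (uniform by default). The function name, the damping value, and this reading of the two behaviors are all assumptions, not the authors' definition.

```python
def biprank_sketch(sentence_to_terms, prior=None, damping=0.85, iterations=50):
    """Assumed random-surfer scoring over a sentence-term bipartite graph."""
    sentences = sorted(sentence_to_terms)
    if prior is None:
        # Uniform jump distribution; a pattern-based prior could be passed instead.
        prior = {s: 1.0 / len(sentences) for s in sentences}
    term_to_sentences = {}
    for s in sentences:
        for t in sentence_to_terms[s]:
            term_to_sentences.setdefault(t, []).append(s)

    score = dict(prior)
    for _ in range(iterations):
        nxt = {s: 0.0 for s in sentences}
        jump_mass = 0.0
        for s in sentences:
            terms = sentence_to_terms[s]
            # Sentences without terms cannot follow links, so they only jump.
            follow = damping * score[s] if terms else 0.0
            jump_mass += score[s] - follow
            for t in terms:
                targets = term_to_sentences[t]
                for target in targets:
                    nxt[target] += follow / len(terms) / len(targets)
        for s in sentences:
            nxt[s] += jump_mass * prior[s]
        score = nxt
    return score

# Toy graph: s1 and s2 share term "a"; s3 is isolated (illustrative data only).
toy_graph = {"s1": {"a", "b"}, "s2": {"a"}, "s3": {"c"}}
toy_scores = biprank_sketch(toy_graph)
```

Under these assumptions the scores stay a probability distribution over sentences, and a sentence that shares terms with others and describes more terms tends to accumulate more mass.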

Page 14: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Pattern of RDF sentence

Page 15: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

p(s|u)

Frequency of Pattern(s): number of RDF sentences in the vocabulary that have the same pattern

Popularity of Pattern(s): number of vocabularies in the repository that have the same pattern
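
The two counts above can be sketched as follows. Patterns are treated as opaque labels (`"P1"`, `"P2"` are hypothetical), since the transcript does not preserve how a pattern is derived from a sentence. How the counts are turned into p(s|u) is also not preserved; the proportional normalization in `jump_distribution` is an assumption for illustration.

```python
from collections import Counter

def pattern_frequency(vocabulary_patterns):
    """Count RDF sentences per pattern within one vocabulary.

    vocabulary_patterns: dict mapping sentence id -> pattern label.
    """
    return Counter(vocabulary_patterns.values())

def pattern_popularity(repository):
    """Count vocabularies per pattern across a repository.

    repository: dict mapping vocabulary id -> {sentence id -> pattern label}.
    """
    popularity = Counter()
    for patterns in repository.values():
        # Each vocabulary contributes at most once per pattern.
        popularity.update(set(patterns.values()))
    return popularity

def jump_distribution(vocabulary_patterns, weights):
    """Assumed form of p(s|u): weight of the sentence's pattern, normalized."""
    raw = {s: weights[p] for s, p in vocabulary_patterns.items()}
    total = sum(raw.values())
    return {s: w / total for s, w in raw.items()}

# Toy repository with two vocabularies (illustrative data only).
repository = {
    "v1": {"s1": "P1", "s2": "P1", "s3": "P2"},
    "v2": {"t1": "P1"},
}
freq = pattern_frequency(repository["v1"])
pop = pattern_popularity(repository)
dist = jump_distribution(repository["v1"], freq)
```

Passing `pop` instead of `freq` as the weights would give the popularity-based variant of the same assumed distribution.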

Page 16: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Evaluation setting

Test cases: 9 moderate-sized vocabularies randomly selected from Falcons

Gold standard: Salience given by 6 human experts

Competitors: Cp: Zhang et al. (WWW2007)

Our approach: BipRank-U: pattern-unaware

BipRank-F: using pattern frequency

BipRank-P: using pattern popularity

Metric: Pearson product-moment correlation coefficient
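
For reference, the Pearson product-moment correlation coefficient between computed salience scores and expert-assigned salience can be calculated as in the sketch below. The sample values are made up for illustration and are not the evaluation data from the talk.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined when either sequence is constant (zero variance).
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores: algorithm output vs. averaged expert ratings.
computed = [0.40, 0.25, 0.20, 0.15]
expert = [9.0, 6.0, 5.5, 3.0]
r = pearson(computed, expert)
```

A value close to 1 means the computed ranking scores move together with the expert judgments; a value near 0 means no linear relationship.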

Page 17: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Evaluation results

Page 18: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Outline

Introduction

Salience measurement

Vocabulary summarization

Conclusions

Page 19: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Goodness of a summary

Salience

Query relevance: Textual similarity between query and summary

Cohesion: Term overlap between RDF sentences
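
The slide names the two measures but the transcript does not give their formulas. As one possible reading, the sketch below uses cosine similarity over bag-of-words counts for query relevance and the number of shared terms over all sentence pairs for cohesion. Both choices, and the function names, are assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations

def query_relevance(query, summary_texts):
    """Cosine similarity between query tokens and the summary's tokens (assumed measure)."""
    q = Counter(query.lower().split())
    s = Counter(token for text in summary_texts for token in text.lower().split())
    dot = sum(q[token] * s[token] for token in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in s.values()))
    return dot / norm if norm else 0.0

def cohesion(summary_term_sets):
    """Total number of terms shared by each pair of RDF sentences (assumed measure)."""
    return sum(len(a & b) for a, b in combinations(summary_term_sets, 2))

# Illustrative inputs only.
relevant = query_relevance("person name", ["Person has name"])
irrelevant = query_relevance("person name", ["Document title"])
overlap = cohesion([{"a", "b"}, {"b", "c"}, {"d"}])
```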

Page 20: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Looking for the best summary

Multi-objective optimization

Single aggregate objective function

Solution: a greedy strategy
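
A greedy strategy over a single aggregate objective can be sketched as below: at each step, add the sentence that gives the largest objective value for the resulting summary. The weighted-sum objective, the weights `alpha` and `beta`, and the toy salience and term data are assumptions for illustration; the transcript does not preserve the actual objective function.

```python
from itertools import combinations

def greedy_summary(candidates, objective, k):
    """Pick up to k sentences, each time adding the one that maximizes the objective."""
    summary = []
    remaining = list(candidates)
    while remaining and len(summary) < k:
        best = max(remaining, key=lambda s: objective(summary + [s]))
        summary.append(best)
        remaining.remove(best)
    return summary

# Hypothetical per-sentence salience and term sets.
salience = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
terms = {"s1": {"a", "b"}, "s2": {"c"}, "s3": {"a"}}

def aggregate_objective(summary, alpha=1.0, beta=0.5):
    """Assumed aggregate: weighted sum of total salience and pairwise term overlap."""
    total_salience = sum(salience[s] for s in summary)
    total_overlap = sum(len(terms[a] & terms[b]) for a, b in combinations(summary, 2))
    return alpha * total_salience + beta * total_overlap

picked = greedy_summary(["s1", "s2", "s3"], aggregate_objective, k=2)
```

In this toy case the second pick is `s3` rather than the more salient `s2`, because `s3` shares a term with `s1` and the assumed cohesion weight outweighs the salience gap. Greedy selection does not guarantee the best possible summary, but it avoids enumerating every subset of sentences.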

Page 21: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Evaluation setting

Judges: 18 human experts

Test cases: 190 searches over 2,012 vocabularies crawled by Falcons

Competitors: Generic: Zhang et al. (WWW2007)

Our approach: QR: query relevance

QR+S: query relevance + salience

QR+C: query relevance + cohesion

Metric: Rating on a 10-point scale

Page 22: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Evaluation results

Page 23: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Performance testing

[Figure; axis labels: Size of vocabulary, Size of summary, Runtime]

Page 24: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Outline

Introduction

Salience measurement

Vocabulary summarization

Conclusions

Page 25: BipRank: Ranking and Summarizing RDF Vocabulary Descriptions

Conclusions

Salience measurement: Sentence-term graph

BipRank

Pattern of RDF sentence

Vocabulary summarization: Salience

Query relevance

Cohesion

Implemented in Falcons Ontology Search: http://ws.nju.edu.cn/falcons/ontologysearch/