Meta Structure: Computing Relevance in Large Heterogeneous … · Nikos Mamoulis, Xiang Li, “Meta...
Meta Structure: Computing Relevance in Large Heterogeneous Information Networks
Zhipeng Huang http://i.cs.hku.hk/~zphuang/
Introduction
• Computing relevance on networks (e.g., social network, co-author network) supports many applications:
  – similarity search
  – recommendation
• Many measures have been studied:
  – Jaccard coefficient, common neighbors, shortest path
  – PageRank, Personalized PageRank, SimRank, etc.
Heterogeneous Information Network
• HIN: Directed graph with multiple node types and edge types.
[Figure: example bibliographic HIN. Object types: author (a1–a3), paper (p1,1, p1,2, p2,1, p2,2, p3,1, p3,2; labeled KDD’07, VLDB’06, ICDM’12, KDD’15, AAAI’15, VLDB’15), venue (v1–v4; labeled KDD, VLDB, AAAI, ICDM), topic (t1–t4; labeled “mining”, “efficient”, “privacy”, “social”). Edge types: write, publish, mention.]
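One way to hold such a network in memory is sketched below. This is a minimal illustration, not the representation used by the authors; the class name `HIN`, its methods, and the sample nodes are all hypothetical, and the sample data does not reproduce the figure.

```python
from collections import defaultdict


class HIN:
    """A minimal heterogeneous information network: typed nodes, typed directed edges."""

    def __init__(self):
        self.node_type = {}                # node id -> object type
        self.out_edges = defaultdict(set)  # (node id, edge type) -> set of target ids

    def add_node(self, node, obj_type):
        self.node_type[node] = obj_type

    def add_edge(self, src, edge_type, dst):
        if src not in self.node_type or dst not in self.node_type:
            raise KeyError("add both endpoints before adding an edge")
        self.out_edges[(src, edge_type)].add(dst)

    def neighbors(self, node, edge_type):
        """Targets reachable from `node` over edges of the given type."""
        return self.out_edges.get((node, edge_type), set())


# Hypothetical sample data using the slide's object and edge types.
hin = HIN()
for node, obj_type in [("a1", "author"), ("p1", "paper"),
                       ("KDD", "venue"), ("mining", "topic")]:
    hin.add_node(node, obj_type)
hin.add_edge("a1", "write", "p1")
hin.add_edge("p1", "publish", "KDD")
hin.add_edge("p1", "mention", "mining")
```

Keying the adjacency by (node, edge type) keeps a type-constrained traversal to a single lookup per hop.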
Meta Path-Based Relevance Measures
• Meta Path: a sequence of node and edge types.
• Measures: PathCount [1], PathSim [1] and PCRW [2]
• Source: Automatically generate meta paths (WWW’15) [3]
• Limitation: A meta path fails to capture common nodes.
  – Example: A researcher wants to search for authors who have published papers in the same venue and on the same topic as his papers.
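The three measures can be illustrated on a small graph. The sketch below assumes the usual definitions: PathCount counts path instances of the meta path, PathSim normalizes that count as 2·count(x, y) / (count(x, x) + count(y, y)) for a symmetric meta path, and PCRW is the probability that a random walk constrained to the meta path reaches y from x. The toy data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy bibliographic network; edges are traversed in both directions.
EDGES = [
    ("a1", "p1"), ("a2", "p2"), ("a2", "p3"), ("a3", "p4"),        # author - paper
    ("p1", "KDD"), ("p2", "KDD"), ("p3", "VLDB"), ("p4", "VLDB"),  # paper - venue
]
NODE_TYPE = {"a1": "A", "a2": "A", "a3": "A",
             "p1": "P", "p2": "P", "p3": "P", "p4": "P",
             "KDD": "V", "VLDB": "V"}

ADJ = defaultdict(set)
for u, v in EDGES:
    ADJ[u].add(v)
    ADJ[v].add(u)


def step(weights, next_type, normalize):
    """Advance a {node: weight} frontier by one hop to nodes of `next_type`."""
    out = defaultdict(float)
    for node, w in weights.items():
        targets = [n for n in ADJ[node] if NODE_TYPE[n] == next_type]
        for t in targets:
            # PCRW splits the weight uniformly; PathCount copies it to every target.
            out[t] += (w / len(targets)) if normalize else w
    return out


def walk(source, meta_path, normalize=False):
    frontier = {source: 1.0}
    for next_type in meta_path[1:]:
        frontier = step(frontier, next_type, normalize)
    return frontier


def path_count(x, y, meta_path):
    return walk(x, meta_path).get(y, 0.0)


def pcrw(x, y, meta_path):
    return walk(x, meta_path, normalize=True).get(y, 0.0)


def path_sim(x, y, meta_path):
    denom = path_count(x, x, meta_path) + path_count(y, y, meta_path)
    return 2 * path_count(x, y, meta_path) / denom if denom else 0.0


APVPA = ["A", "P", "V", "P", "A"]  # author-paper-venue-paper-author
```

On this toy graph PathCount and PathSim are symmetric in x and y, while PCRW is not: `pcrw("a1", "a2", APVPA)` and `pcrw("a2", "a1", APVPA)` differ because a2 has more papers to spread its walk over.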
Linear Combination
• R(a1,a2) = R(a1,a2|P1) + R(a1,a2|P2) = 1 + 1 = 2
• R(a2,a3) = R(a2,a3|P1) + R(a2,a3|P2) = 1 + 1 = 2
• Both pairs score 2, so the linear combination cannot tell them apart.
[Figure: the example HIN with authors a1–a3, their papers, venues v1–v4, and topics t1–t4; edge types write, publish, mention.]
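The tie can be reproduced on a small hypothetical graph. The sketch assumes, for illustration only, that P1 is author-paper-venue-paper-author and P2 is author-paper-topic-paper-author (the slide does not spell them out) and that both paths have weight 1. The data is invented and does not reproduce the figure; the last function counts paper pairs that share both the venue and the topic, which is the query from the researcher example.

```python
# Hypothetical toy data: each paper has one author, one venue and one topic.
PAPERS = {
    "p1":  {"author": "a1", "venue": "v1", "topic": "t1"},
    "p2a": {"author": "a2", "venue": "v1", "topic": "t1"},
    "p2b": {"author": "a2", "venue": "v2", "topic": "t2"},
    "p3a": {"author": "a3", "venue": "v2", "topic": "t3"},
    "p3b": {"author": "a3", "venue": "v3", "topic": "t2"},
}


def papers_of(author):
    return [p for p, info in PAPERS.items() if info["author"] == author]


def path_count(x, y, via):
    """Count author-paper-<via>-paper-author path instances between x and y."""
    return sum(
        1
        for px in papers_of(x)
        for py in papers_of(y)
        if PAPERS[px][via] == PAPERS[py][via]
    )


def linear_combination(x, y, weights=(1.0, 1.0)):
    """Weighted sum of the venue-path count and the topic-path count."""
    w_venue, w_topic = weights
    return w_venue * path_count(x, y, "venue") + w_topic * path_count(x, y, "topic")


def shared_venue_and_topic(x, y):
    """Count paper pairs of x and y that share BOTH venue and topic."""
    return sum(
        1
        for px in papers_of(x)
        for py in papers_of(y)
        if PAPERS[px]["venue"] == PAPERS[py]["venue"]
        and PAPERS[px]["topic"] == PAPERS[py]["topic"]
    )
```

Here `linear_combination("a1", "a2")` and `linear_combination("a2", "a3")` are both 2.0, yet only a1 and a2 have a pair of papers in the same venue and on the same topic. The sum of two independent path counts loses the fact that the venue and the topic hang off the same paper.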
Meta Structure
• A powerful extension of the meta path: a directed acyclic graph (DAG) rather than a sequence.
• More powerful.
  – Contains more information than a meta path and can express more semantic meaning.
• Challenges:
  – How to define measures based on a meta structure?
  – A more complex structure leads to high computational cost.
  – How to derive a meta structure? (Not yet studied well)
Relevance Measures
• StructCount: extension of PathCount
• Structure Constrained Random Walk
• Biased Structure Constrained Random Walk, a combination of the previous two measures.
$\mathrm{StructCount}(x_0, y_0 \mid S) = \mathrm{GraphIns}(x_0, y_0 \mid S)$
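The relationship between the three measures (abbreviated SC, SCSE and BSCSE in the result tables) can be sketched with one recursive function. This is an illustration under stated assumptions, not the authors' implementation: it assumes the recursive form recalled from [4], in which the score of a partial instance is the sum of the scores of its expansions to the next layer divided by (number of expansions)^α, so that α = 0 counts instances and α = 1 expands uniformly at random. The form should be checked against the paper. The meta structure used is author → paper → (venue, topic) → paper → author, and the toy data, `expand` and `bscse` are hypothetical names.

```python
# Hypothetical toy data: paper -> (author, venue, topic); one venue and topic per paper.
PAPERS = {
    "p1":  ("a1", "v1", "t1"),
    "p2a": ("a2", "v1", "t1"),
    "p2b": ("a2", "v2", "t2"),
    "p3a": ("a3", "v2", "t3"),
    "p3b": ("a3", "v3", "t2"),
}


def expand(layer, state):
    """Return every instance of layer + 1 reachable from `state`, an instance of `layer`.

    Layers: 0 author, 1 paper, 2 (venue, topic), 3 paper, 4 author.
    """
    if layer == 0:  # author -> papers written by the author
        return [p for p, (a, _, _) in PAPERS.items() if a == state]
    if layer == 1:  # paper -> its (venue, topic) pair
        _, v, t = PAPERS[state]
        return [(v, t)]
    if layer == 2:  # (venue, topic) -> papers linked to BOTH nodes
        return [p for p, (_, v, t) in PAPERS.items() if (v, t) == state]
    if layer == 3:  # paper -> its author
        return [PAPERS[state][0]]
    raise ValueError("no layer after the target layer")


def bscse(state, target, alpha, layer=0):
    """Assumed recursion: sum over expansions, divided by (number of expansions) ** alpha."""
    if layer == 4:
        return 1.0 if state == target else 0.0
    nxt = expand(layer, state)
    if not nxt:
        return 0.0
    total = sum(bscse(s, target, alpha, layer + 1) for s in nxt)
    return total / (len(nxt) ** alpha)
```

With `alpha=0` the function returns the number of instances between the two authors (1.0 for a1 and a2, 0.0 for a2 and a3 on this data); with `alpha=1` it returns a probability (0.5 from a1 to a2, 0.25 from a2 to a1); values in between bias the result toward one or the other.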
Recursive Tree
• To calculate the BSCSE relevance of a2 and a1:
[Figure: recursive tree built over the example HIN (authors a1–a3, their papers, venues, and topics); tree nodes are annotated with the values 1.0, 0.5, 0.25, and 0.0.]
i-LTable
• Index the probability distribution starting from the i-th level of a meta structure.
[Figure: the example HIN with authors a1–a3, their papers, venues v1–v4, and topics t1–t4; edge types write, publish, mention.]
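The indexing idea can be sketched as a lookup table built once and reused by every query. The sketch is only an illustration of precomputing from an intermediate level, not the paper's index layout. It assumes the meta structure author → paper → (venue, topic) → paper → author, indexes level i = 2 (the (venue, topic) pairs), and uses the same assumed scoring rule as a uniform or biased expansion with exponent α. The toy data, `build_table` and `relevance` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: paper -> (author, venue, topic); one venue and topic per paper.
PAPERS = {
    "p1":  ("a1", "v1", "t1"),
    "p2a": ("a2", "v1", "t1"),
    "p2b": ("a2", "v2", "t2"),
    "p3a": ("a3", "v2", "t3"),
    "p3b": ("a3", "v3", "t2"),
}


def build_table(alpha):
    """For every level-2 instance (venue, topic), index the scores of reachable authors."""
    table = {}
    for _, v, t in PAPERS.values():
        papers = [p for p, (_, pv, pt) in PAPERS.items() if (pv, pt) == (v, t)]
        scores = defaultdict(float)
        for p in papers:
            # The paper -> author step has a single expansion, so it adds no extra factor.
            scores[PAPERS[p][0]] += 1.0 / (len(papers) ** alpha)
        table[(v, t)] = dict(scores)
    return table


def relevance(x, y, table, alpha):
    """Expand levels 0 and 1 online, then finish with a table lookup at level 2."""
    papers = [p for p, (a, _, _) in PAPERS.items() if a == x]
    if not papers:
        return 0.0
    total = 0.0
    for p in papers:
        _, v, t = PAPERS[p]  # exactly one (venue, topic) expansion per paper in this toy
        total += table[(v, t)].get(y, 0.0)
    return total / (len(papers) ** alpha)


table = build_table(alpha=1)
```

Every query that passes through the same (venue, topic) pair reuses the stored distribution, so the second half of the structure is traversed once at build time rather than once per query.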
Experiment: Entity Resolution
• To find duplicated entities in YAGO
  – Barack_Obama and Presidency_Of_Barack_Obama
• Metric: AUC
P1
Measure   PathCount   PCRW     PathSim
AUC       0.1324      0.0120   0.0097
P2
Measure   PathCount   PCRW     PathSim
AUC       0.0003      0.0014   0.0002
Linear Combination (optimal)
Measure   PathCount   PCRW     PathSim
AUC       0.2898      0.2606   0.2920
Meta Structure S
Measure   SC          SCSE     BSCSE*
AUC       0.5556      0.5640   0.5640
Relevance Ranking
• We label the relevance of venues in DBLP_4_Area.
• 0 for not relevant, 1 for relevant and 2 for strongly relevant.
• Labels consider both the scope and the level of the venues (e.g., SIGMOD and VLDB are labeled 2).
• Normalized Discounted Cumulative Gain (NDCG)
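For reference, NDCG over graded labels such as 0, 1 and 2 can be computed as sketched below. The slides do not say which DCG variant was used; this sketch assumes the linear-gain form, where each label is divided by log2(rank + 1), and the function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: gain = label, discount = log2(rank + 1), rank from 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(ranked_relevances):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0


# Labels of the venues in the order a measure ranked them (hypothetical example).
score = ndcg([2, 1, 2, 0])
```

A ranking that already lists the labels in non-increasing order scores 1.0, and any misordering of differently labeled venues lowers the score.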
Relevance Ranking
P1
Measure   PathCount   PCRW     PathSim
nDCG      0.9004      0.9047   0.9083
P2
Measure   PathCount   PCRW     PathSim
nDCG      0.8224      0.8901   0.8834
Linear Combination (optimal)
Measure   PathCount   PCRW     PathSim
nDCG      0.9004      0.9100   0.9083
Meta Structure S
Measure   SC          SCSE     BSCSE*
nDCG      0.9056      0.9104   0.9130
Reference
• [1] Sun, Yizhou, et al. “PathSim: Meta path-based top-k similarity search in heterogeneous information networks.” VLDB’11 (2011).
• [2] Lao, Ni, and William W. Cohen. “Relational retrieval using a combination of path-constrained random walks.” Machine Learning 81.1 (2010): 53-67.
• [3] Meng, Changping, et al. “Discovering meta-paths in large heterogeneous information networks.” WWW’15.
• [4] Zhipeng Huang, Yudian Zheng, Reynold Cheng, Yizhou Sun, Nikos Mamoulis, Xiang Li, “Meta Structure: Computing Relevance in Large Heterogeneous Information Networks”, SIGKDD’16.