Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents
![Page 1: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/1.jpg)
Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents
Date : 2013/09/17
Source : SIGIR’13
Authors : Zhu, Xingwei; Ming, Zhao-Yan; Zhu, Xiaoyan; Chua, Tat-Seng
Advisor : Dr. Jia-ling, Koh
Speaker : Wei, Chang
1
![Page 2: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/2.jpg)
Outline
• Introduction
• Approach
• Experiment
• Conclusion
2
![Page 3: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/3.jpg)
IPhone 5s? IPhone 5c?
3
![Page 4: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/4.jpg)
Multi-Source User Generated Contents
4
![Page 5: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/5.jpg)
Problem Formulation
• Goal : Given a root topic C and its information source set Sc, we aim to build and continuously update a topic hierarchy H for C in order to organize the information in Sc according to their relevant topics.
• In this paper, Sc = {Blogger, Twitter, community QA site (cQA)}
5
![Page 6: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/6.jpg)
Outline
• Introduction
• Approach
  • Framework
  • Topic Term Identification
  • Topic Relation Identification
  • Topic Hierarchy Generation
  • Topic Hierarchy Update
• Experiment
• Conclusion
6
![Page 7: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/7.jpg)
Framework
7
![Page 8: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/8.jpg)
Topic Term Identification
8
[Figure: flow diagram with the labels User Generated Contents, Heuristic Rules, Potential Grounding Topics, TF-IDF, Grounding Topic Set, External Sources, Final Candidate Topic Set]
![Page 9: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/9.jpg)
Heuristic Rules
9
![Page 10: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/10.jpg)
Grounding Topic Set
10
[Figure: example with the documents Blog 1, Tweet 1, Tweet 2, QA 1 and QA 2, the candidate terms Apple Inc., T-Mobile, IPhone, IOS, Price, 64-bit and Smartphone, and a TFIDF step that leaves terms such as IPhone, Apple Inc., T-Mobile and IOS attached to the documents]
![Page 11: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/11.jpg)
Grounding Topic Set
• Blogs :
  • Use the content and title
  • Double the weights of terms in titles
  • Use the top 5 terms
• cQAs :
  • Use the question title, description and the best answers
  • Use the top 5 terms
• Tweets :
  • Use the content
  • Use the top 1 term
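The per-source rules above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the document layout (`source`, `fields`), the names `TOP_K`, `term_counts` and `grounding_terms`, and the plain `tf * log(N / df)` weighting are all assumptions made for the example. The slide only states which fields are used, that title terms in blogs get double weight, and how many terms are kept.

```python
import math
from collections import Counter

# Number of terms kept per source, as listed on the slide.
TOP_K = {"blog": 5, "cqa": 5, "tweet": 1}

def term_counts(doc):
    """Count candidate terms in a document; blog title terms count twice."""
    counts = Counter()
    for field, terms in doc["fields"].items():
        weight = 2 if (doc["source"] == "blog" and field == "title") else 1
        for term in terms:
            counts[term] += weight
    return counts

def grounding_terms(docs):
    """Return the top-k TF-IDF terms of each document (k depends on the source)."""
    counts = [term_counts(doc) for doc in docs]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    n = len(docs)
    selected = []
    for doc, c in zip(docs, counts):
        # Assumed weighting: raw term frequency times log inverse document frequency.
        scores = {t: tf * math.log(n / doc_freq[t]) for t, tf in c.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected.append(ranked[:TOP_K[doc["source"]]])
    return selected

docs = [
    {"source": "blog", "fields": {"title": ["iphone"], "content": ["apple", "price"]}},
    {"source": "tweet", "fields": {"content": ["apple", "ios"]}},
    {"source": "cqa", "fields": {"title": ["apple"], "description": ["t-mobile"],
                                 "best_answers": ["price"]}},
]
top_terms = grounding_terms(docs)
```

With this toy input the tweet keeps a single term while the blog and the cQA thread keep up to five, so short tweets contribute only their most distinctive term.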
11
![Page 12: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/12.jpg)
Topic Set Extension
• What we already have :
  • Grounding topic set
• What it lacks :
  • Middle level topics
• How to get middle level topics :
  • Search Engine : 2 patterns
    • * such as <slot>
    • <slot> of *
  • WordNet : direct hypernym
  • Wikipedia : category tags
• Final candidate topic set :
12
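As a rough illustration of the two search-engine patterns, the sketch below fills `<slot>` with a grounding topic and reads the word matched by `*` out of a result snippet. Filling the slot with a grounding topic, capturing a single word for `*`, and the helper names `build_queries` and `wildcard_matches` are assumptions for this example. The slide gives only the two patterns.

```python
import re

# The two patterns from the slide; "*" stands for the middle level topic.
PATTERNS = ("* such as <slot>", "<slot> of *")

def build_queries(topic):
    """Fill <slot> in each pattern with the given topic."""
    return [p.replace("<slot>", topic) for p in PATTERNS]

def wildcard_matches(query, snippet):
    """Return the words that "*" matches in a snippet (one word per match)."""
    # Escape the query, then turn the escaped "*" into a one-word capture group.
    regex = re.escape(query).replace(r"\*", r"(\w+)")
    return [m.lower() for m in re.findall(regex, snippet, flags=re.IGNORECASE)]

queries = build_queries("IPhone")
found = wildcard_matches(queries[0], "Smartphones such as iPhone are popular")
```

Here `found` holds the candidate middle level topic extracted from a hypothetical snippet; a real system would have to handle multi-word topics as well.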
![Page 13: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/13.jpg)
![Page 14: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/14.jpg)
Topic Relation Identification
14
[Figure: three topic nodes, IPhone, IPhone 5s and Apple Inc., connected by directed edges labeled e(r(tA, tB)), e(r(tB, tA)), e(r(tA, tC)), e(r(tC, tA)), e(r(tB, tC)) and e(r(tC, tB))]
Denote r(tA, tB) as a sub-topic relation, which means tB is a sub-topic of tA.
![Page 15: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/15.jpg)
Topic Relation Identification
15
![Page 16: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/16.jpg)
Evidences from the Information Source Set
• The evidence for a pair of topics is the cosine similarity between their corresponding contexts
• V=(smart phone, price, buy, iOS, Android)
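A small worked example of this evidence, assuming each topic's context is turned into a term-count vector over the vocabulary V shown on the slide. The two context lists and the helper names `context_vector` and `cosine_similarity` are made up for illustration.

```python
import math

# Vocabulary from the slide.
V = ("smart phone", "price", "buy", "iOS", "Android")

def context_vector(context_terms):
    """Count how often each vocabulary term occurs in a topic's context."""
    return [context_terms.count(word) for word in V]

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical contexts for two topics.
ctx_a = context_vector(["smart phone", "price", "iOS", "buy"])
ctx_b = context_vector(["smart phone", "price", "Android"])
similarity = cosine_similarity(ctx_a, ctx_b)
```

The two contexts share "smart phone" and "price", so the similarity is 2 / (2 · √3), roughly 0.58.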
16
![Page 17: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/17.jpg)
Evidences from Wikipedia
Pointwise Mutual Information (PMI)
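The slide names PMI without giving the formula. The standard definition is PMI(a, b) = log(p(a, b) / (p(a) · p(b))). The sketch below estimates the probabilities from document counts. Treating the counts as numbers of Wikipedia articles that mention each topic is an assumption, as are the function name and the toy numbers.

```python
import math

def pmi(count_a, count_b, count_ab, total):
    """Pointwise mutual information from occurrence counts.

    count_a, count_b: documents mentioning each topic (assumed > 0)
    count_ab: documents mentioning both topics
    total: total number of documents
    """
    if count_ab == 0:
        return float("-inf")  # the two topics never co-occur
    # log( p(a,b) / (p(a) * p(b)) ) with p(x) = count_x / total
    return math.log((count_ab * total) / (count_a * count_b))

independent = pmi(10, 10, 1, 100)   # co-occurrence at chance level
associated = pmi(10, 10, 5, 100)    # co-occurrence five times above chance
```

A value of 0 means the two topics co-occur exactly as often as independence would predict, and larger values indicate a stronger association.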
17
![Page 18: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/18.jpg)
Evidences from WordNet
18
![Page 19: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/19.jpg)
Evidences from Search Engine Results
• Pattern-based evidences
• Query = “tA such as tB and” root topic
• The evidence is set to 1 if the search engine returns more than ζ results that contain this query; otherwise it is set to 0.
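The thresholding rule can be written as a small indicator function. In this sketch `hit_count` is a hypothetical callable standing in for a search-engine lookup. No real search API is assumed, and the threshold value and the fake result counts are invented for the example.

```python
def pattern_query(t_a, t_b, root_topic):
    """Build the query: the quoted pattern followed by the root topic."""
    return f'"{t_a} such as {t_b} and" {root_topic}'

def pattern_evidence(t_a, t_b, root_topic, hit_count, zeta):
    """Return 1 if the query yields more than zeta results, otherwise 0."""
    return 1 if hit_count(pattern_query(t_a, t_b, root_topic)) > zeta else 0

# Stand-in for a search engine: a fixed table of result counts.
fake_hits = {'"smartphone such as IPhone and" Apple Inc.': 1200}
lookup = lambda query: fake_hits.get(query, 0)

forward = pattern_evidence("smartphone", "IPhone", "Apple Inc.", lookup, zeta=100)
backward = pattern_evidence("IPhone", "smartphone", "Apple Inc.", lookup, zeta=100)
```

Because the pattern is directional, the evidence for (tA, tB) and for (tB, tA) is computed with two separate queries.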
19
![Page 20: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/20.jpg)
Combine Evidences
20
![Page 21: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/21.jpg)
![Page 22: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/22.jpg)
Topic Hierarchy Generation
22
![Page 23: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/23.jpg)
Topic Hierarchy Generation
23
![Page 24: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/24.jpg)
Topic Hierarchy Generation
24
![Page 25: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/25.jpg)
Topic Hierarchy Generation
25
![Page 26: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/26.jpg)
Edge Weighting
26
![Page 27: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/27.jpg)
Hierarchy Pruning
• Use the Chu-Liu/Edmonds optimum branching algorithm
  • every non-root node has only one parent and the sum of the edge weights is maximized
• Remove
  • (1) the nodes that are not reachable from the root topic and
  • (2) the leaf nodes that are not in the grounding topic set.
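The two removal rules can be sketched as a post-processing step on the output of the branching algorithm. The sketch assumes the branching is already available as a child-to-parent map and does not implement Chu-Liu/Edmonds itself. Repeating rule (2) until no offending leaf remains is also an assumption, since the slide does not say whether the removal is iterative. The function name and the toy hierarchy are illustrative.

```python
from collections import defaultdict

def prune_hierarchy(parent, root, grounding):
    """Prune a branching given as {child: parent}; returns the kept {child: parent}."""
    children = defaultdict(set)
    for child, par in parent.items():
        children[par].add(child)

    # (1) Keep only the nodes reachable from the root topic.
    reachable, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in reachable:
            continue
        reachable.add(node)
        stack.extend(children[node])
    kept = {c: p for c, p in parent.items() if c in reachable}

    # (2) Drop leaves outside the grounding topic set, repeating until stable
    #     (the repetition is an assumption of this sketch).
    while True:
        internal = set(kept.values())
        drop = [c for c in kept if c not in internal and c not in grounding]
        if not drop:
            return kept
        for c in drop:
            del kept[c]

parent = {
    "IPhone": "Apple Inc.",
    "IOS": "Apple Inc.",
    "Smartphone": "Apple Inc.",
    "Gadget": "Smartphone",
    "Price": "Orphan",  # "Orphan" is not connected to the root
}
pruned = prune_hierarchy(parent, "Apple Inc.", grounding={"IPhone", "IOS", "Price"})
```

In the toy input, "Price" is lost because its parent cannot be reached from the root, and "Gadget" and then "Smartphone" are removed because neither is a grounding topic once it becomes a leaf.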
27
![Page 28: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/28.jpg)
Topic Hierarchy Update
28
![Page 29: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/29.jpg)
![Page 30: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/30.jpg)
Topic Term Identification
30
![Page 31: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/31.jpg)
Topic Hierarchy Generation
31
![Page 32: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/32.jpg)
Topic Hierarchy Generation
32
![Page 33: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/33.jpg)
Hierarchy Update
33
![Page 34: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/34.jpg)
![Page 35: Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816910550346895de028d6/html5/thumbnails/35.jpg)
Conclusion
• Given a root topic, we used evidences from multiple UGCs to identify topic terms and sub-topic relations between them. With these topic terms, a graph-based algorithm was applied to generate and update the topic hierarchies, on which the UGCs can be organized according to their relevant topics.
35