A Hierarchical Monothetic Document Clustering Algorithm for Summarization and Browsing Search Results

Kummamuru et al.

Presented by Bei Yu, Sept. 22nd, 2004

Roadmap

Properties of topic hierarchy
Automatic taxonomy generation (ATG)
Monothetic ATG
DisCover algorithm
CAARD algorithm
DSP algorithm
Result comparison
Questions

Generating Topic Hierarchy (Taxonomy)

Desirable properties of a topic hierarchy:
  Document coverage
  Compactness (breadth/depth, number of nodes)
  Sibling node distinctiveness
  Node label predictiveness
  General to specific
  Reach time

Monothetic ATG

Automatic Taxonomy Generation (ATG): monothetic vs. polythetic
  Monothetic: cluster assignment based on a single feature
  Polythetic: cluster assignment based on multiple features
Clustering keywords vs. documents vs. both
Top-down vs. bottom-up
Monothetic ATG algorithms:
  Subsumption algorithm (Sanderson and Croft, 1999)
  DSP (Lawrie et al., 2001)
  CAARD (Kummamuru and Krishnapuram, 2001)
  DisCover (this paper)
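To make the monothetic/polythetic distinction concrete, here is a minimal sketch. The toy documents, the prototype term sets, and the use of Jaccard overlap for the polythetic case are illustrative assumptions, not taken from the paper.

```python
# Toy corpus: each document is a set of terms (hypothetical data).
docs = {
    "d1": {"jaguar", "car", "engine"},
    "d2": {"jaguar", "cat", "jungle"},
    "d3": {"car", "engine", "speed"},
}

def monothetic_assign(docs, feature):
    """Monothetic: membership is decided by one feature only."""
    return {doc_id for doc_id, terms in docs.items() if feature in terms}

def jaccard(a, b):
    """Jaccard overlap between two term sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def polythetic_assign(docs, prototypes):
    """Polythetic: membership is decided by overlap on many features.

    `prototypes` maps a cluster name to a representative term set
    (a stand-in for a centroid; an assumption of this sketch).
    """
    return {
        doc_id: max(sorted(prototypes), key=lambda name: jaccard(terms, prototypes[name]))
        for doc_id, terms in docs.items()
    }

prototypes = {
    "vehicles": {"car", "engine", "speed"},
    "animals": {"cat", "jungle", "prey"},
}
jaguar_cluster = monothetic_assign(docs, "jaguar")
poly_clusters = polythetic_assign(docs, prototypes)
```

In the monothetic case the cluster label ("jaguar") is the feature itself, which is why monothetic hierarchies tend to have directly readable node labels.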

DisCover

Progressively grow the hierarchy
Coverage and compactness tradeoff
Generate an optimal permuted sequence of the concepts under a node.
  Every document is represented as a set of concepts; "concepts under the node" means all the other concepts in the documents covered by the node.
Select an optimal subset from the concepts with maximal coverage and distinctiveness

Question: must the number of child nodes be preset?

DisCover

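c_k = \arg\max_{c \in U} g(S_{k-1}, c)

g(S_{k-1}, c_j) = w_1 \, g_c(S_{k-1}, c_j) + w_2 \, g_d(S_{k-1}, c_j)

g_c(S_{k-1}, c_j) = |d(c_j) \setminus d(S_{k-1})|   (coverage)

g_d(S_{k-1}, c_j) = |t(c_j) \setminus t(S_{k-1})|   (distinctiveness)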
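The greedy selection on this slide can be sketched as follows. This is only an illustration under stated assumptions: d(c) is taken to be the set of documents containing concept c, t(c) the set of other concepts in those documents, both gains are counted as set differences against what the already selected sequence covers, and the weights default to 1. None of these choices is confirmed by the slide, and the function name is hypothetical.

```python
def discover_sequence(doc_concepts, candidates, w1=1.0, w2=1.0):
    """Greedily order `candidates` by weighted coverage + distinctiveness gain.

    doc_concepts: dict mapping document id -> set of concepts.
    candidates:   concepts under the current node (the set U).
    """
    def d(c):
        # Assumed: documents that contain concept c.
        return {doc for doc, concepts in doc_concepts.items() if c in concepts}

    def t(c):
        # Assumed: all other concepts in the documents covered by c.
        return set().union(*(doc_concepts[doc] for doc in d(c))) - {c}

    covered_docs, covered_terms = set(), set()
    remaining = sorted(candidates)  # sorted so that ties break alphabetically
    sequence = []
    while remaining:
        def gain(c):
            g_c = len(d(c) - covered_docs)    # newly covered documents
            g_d = len(t(c) - covered_terms)   # newly introduced concepts
            return w1 * g_c + w2 * g_d
        best = max(remaining, key=gain)
        sequence.append(best)
        covered_docs |= d(best)
        covered_terms |= t(best)
        remaining.remove(best)
    return sequence

# Hypothetical documents under a node labelled "jaguar".
docs = {
    "d1": {"jaguar", "car", "engine"},
    "d2": {"jaguar", "car", "speed"},
    "d3": {"jaguar", "cat", "jungle"},
    "d4": {"jaguar", "car", "dealer"},
}
order = discover_sequence(docs, {"car", "cat", "engine"})
```

Here "car" is picked first because it covers three documents; "cat" follows because it adds a new document and a new concept, while "engine" adds no new document. Child nodes would then be taken from a prefix of this sequence.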

CAARD (Kummamuru and Krishnapuram, 2001)

Diagram: corpus -> concepts -> top-level Min_subset + Rest subset (recursive)

Inclusion degree: ID_{ij} = |w_i \cap w_j| / |w_i|
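A small sketch of the inclusion degree, assuming w_i is the set of documents containing concept i and that the numerator is the size of the intersection of the two sets (the operator did not survive in the slide, so this is an assumption). The function name and the toy sets are hypothetical.

```python
def inclusion_degree(w_i, w_j):
    """Fraction of w_i that is also in w_j; 0.0 when w_i is empty."""
    if not w_i:
        return 0.0
    return len(w_i & w_j) / len(w_i)

# Hypothetical document sets for two concepts.
w_car = {"d1", "d2", "d4"}
w_jaguar = {"d1", "d2", "d3", "d4"}

id_car_in_jaguar = inclusion_degree(w_car, w_jaguar)   # every "car" doc is a "jaguar" doc
id_jaguar_in_car = inclusion_degree(w_jaguar, w_car)   # only 3 of 4 "jaguar" docs mention "car"
```

The measure is asymmetric: a value of 1.0 in one direction and less in the other suggests that the first concept is the more specific one.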

DSP (Lawrie et al., 2001)

Diagram: corpus -> topic terms and vocabulary terms -> top-level topic terms

Selection criterion: maximal predictive power and vocabulary coverage

Language model Pr_x(A | B): A: topic term; B: vocabulary; A = B

Recursion: A <- subtopic terms around the topic; B = A?
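One way to estimate a quantity like Pr_x(A | B) is by relative frequency over a token stream. The sketch below assumes x is a window size in tokens and counts how often A appears within x tokens of an occurrence of B; the slide does not define x or the estimator, so both are assumptions, and the function name is hypothetical.

```python
def windowed_conditional(tokens, a, b, x):
    """Estimate the probability that `a` occurs within `x` tokens of `b`.

    Returns hits / occurrences of b, or 0.0 if b never occurs.
    """
    positions_b = [i for i, tok in enumerate(tokens) if tok == b]
    if not positions_b:
        return 0.0
    hits = 0
    for i in positions_b:
        # Window of x tokens on each side, excluding the position of b itself.
        window = tokens[max(0, i - x):i] + tokens[i + 1:i + x + 1]
        if a in window:
            hits += 1
    return hits / len(positions_b)

tokens = "the jaguar car has a fast engine and the jaguar cat hunts".split()
p_jaguar_given_car = windowed_conditional(tokens, "jaguar", "car", 2)
p_car_given_jaguar = windowed_conditional(tokens, "car", "jaguar", 2)
```

In this toy text "jaguar" always appears near "car", but "car" appears near only one of the two occurrences of "jaguar", so the two conditional estimates differ.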

Evaluation

In general:
  Precision
  F-measure
  User study
  Summary evaluation (EMIM compared with TF*IDF)
  Reachability
  Reach time

This paper compares:
  Computational complexity
  Coverage and compactness
  Reach time
  User study

Results

Questions

How does performance change as the number of nodes increases further (beyond 9)?

How exactly is the concept sequence mapped to the tree structure?
