Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Extraction of Topic Evolutions from References in Scientific Articles and Its GPU Acceleration. Tomonari MASADA (Nagasaki Univ.), Atsuhiro TAKASU (NII). Masada and Takasu @ CIKM 2012


http://dx.doi.org/10.1145/2396761.2398465

Transcript of Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Page 1: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Extraction of Topic Evolutions from References in Scientific Articles and Its GPU Acceleration

Tomonari MASADA

Nagasaki Univ.

Masada and Takasu @ CIKM 2012

Atsuhiro TAKASU

NII

Page 2: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Problem & Solution

Extract topic evolutions from linked documents.

Modify LDA by introducing a transition probability matrix.


Page 3: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Modeling topic evolutions

Draw topics from the following distribution:

$$
t \cdot \theta_d + (1 - t) \cdot R \cdot \frac{\theta_{d_1} + \theta_{d_2} + \cdots + \theta_{d_n}}{n}
$$

\theta_d: topic distribution of citing document

\theta_{d_1}, ..., \theta_{d_n}: topic distributions of cited documents

R: transition matrix (Apply the same matrix for all citing relations.)

n: #(cited documents)

t, 1 - t: weights of the two terms
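The mixture on this slide (weight t on the citing document's own topic distribution, weight 1 - t on the averaged, matrix-transformed topic distributions of the cited documents) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function name `citing_topic_distribution`, the column-stochastic orientation of the matrix, the fallback for documents without references, and all numbers are hypothetical.

```python
import numpy as np

def citing_topic_distribution(theta_citing, thetas_cited, transition, t):
    """Return t * theta_citing + (1 - t) * transition @ mean(thetas_cited).

    Assumption: transition[k, k2] is the probability that topic k2 in a
    cited document evolves into topic k in the citing document, so every
    column of `transition` sums to 1.
    """
    theta_citing = np.asarray(theta_citing, dtype=float)   # shape (K,)
    thetas_cited = np.asarray(thetas_cited, dtype=float)   # shape (n, K)
    transition = np.asarray(transition, dtype=float)       # shape (K, K)
    if thetas_cited.shape[0] == 0:
        # Assumption: with no cited documents, use the document's own topics.
        return theta_citing
    mean_cited = thetas_cited.mean(axis=0)  # sum over cited docs / #(cited documents)
    evolved = transition @ mean_cited       # same matrix for all citing relations
    return t * theta_citing + (1.0 - t) * evolved

# Hypothetical example with K = 3 topics and two cited documents.
theta_citing = [0.7, 0.2, 0.1]
thetas_cited = [[0.1, 0.8, 0.1],
                [0.3, 0.6, 0.1]]
transition = [[0.8, 0.1, 0.0],
              [0.1, 0.8, 0.2],
              [0.1, 0.1, 0.8]]   # each column sums to 1
mixed = citing_topic_distribution(theta_citing, thetas_cited, transition, t=0.6)
print(mixed, mixed.sum())
```

Because both terms are probability vectors and the assumed matrix is column-stochastic, the result is again a distribution over the K topics.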

Page 4: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

"TERESA"

Our method "TERESA" extracts Topic Evolutions from REferences in Scientific Articles.

We utilize document links to reveal directed relationships among topics, not among documents.

Page 5: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Preceding works

Directed relationship from a time point to another: [Ren+ ICML08]

Directed relationship from a document to another: [Dietz+ ICML07] [Nallapati+ ICWSM08] [Nallapati+ AISTATS11]

Corpus-wide undirected relationship among topics: [Sun+ ICDM09]

Corpus-wide directed relationship among topics: TERESA

Page 6: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

GPU Acceleration

Variational Bayesian inference (VB) is embarrassingly parallel [Zhai+ WWW12].

Time complexity: O(MK^2) for TERESA (cf. O(MK) for LDA)

• K: # topics

• M: # unique doc-word pairs
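The gap between O(MK) and O(MK^2) can be made concrete with a small NumPy sketch. The first function is the standard LDA-style per-pair responsibility update, which touches K entries for each of the M doc-word pairs. The second is a hypothetical stand-in, not the update from the paper, in which each pair is scored over every (target topic, source topic) combination through a K x K matrix. Every row is computed independently of the others, which is what makes such updates easy to spread across GPU threads. All function names are assumptions, and the expected log terms are replaced by logs of random distributions for simplicity.

```python
import numpy as np

def normalize_exp(log_w, axes):
    """Exponentiate and normalize over `axes` in a numerically stable way."""
    log_w = log_w - log_w.max(axis=axes, keepdims=True)
    w = np.exp(log_w)
    return w / w.sum(axis=axes, keepdims=True)

def lda_like_step(doc_idx, word_idx, elog_theta, elog_beta):
    # Result shape (M, K): K entries per doc-word pair, i.e. O(MK) work.
    log_w = elog_theta[doc_idx] + elog_beta[:, word_idx].T
    return normalize_exp(log_w, axes=1)

def transition_like_step(doc_idx, word_idx, elog_src, log_trans, elog_beta):
    # Result shape (M, K, K): entry [m, k, k2] pairs target topic k with
    # source topic k2, i.e. O(MK^2) work. Hypothetical illustration only.
    log_w = (elog_src[doc_idx][:, None, :]             # (M, 1, K) source topics
             + log_trans[None, :, :]                   # (1, K, K) shared matrix
             + elog_beta[:, word_idx].T[:, :, None])   # (M, K, 1) word terms
    return normalize_exp(log_w, axes=(1, 2))

# Hypothetical toy sizes: D documents, V words, K topics, M = 5 pairs.
rng = np.random.default_rng(0)
D, V, K = 4, 6, 3
doc_idx = np.array([0, 0, 1, 2, 3])
word_idx = np.array([1, 4, 2, 5, 0])
elog_theta = np.log(rng.dirichlet(np.ones(K), size=D))    # (D, K)
elog_beta = np.log(rng.dirichlet(np.ones(V), size=K))     # (K, V)
log_trans = np.log(rng.dirichlet(np.ones(K), size=K)).T   # columns sum to 1

phi = lda_like_step(doc_idx, word_idx, elog_theta, elog_beta)
joint = transition_like_step(doc_idx, word_idx, elog_theta, log_trans, elog_beta)
print(phi.shape, joint.shape)  # (5, 3) (5, 3, 3)
```

The extra factor of K shows up directly as the extra array axis, so the cost per pair grows from K to K^2 while the pairs themselves stay independent.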


Page 7: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Experiment

Cora dataset (umass)


Page 8: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Experiment

Cora dataset (umass)

[Figure; topic labels shown: word sense disambiguation, tagging, Gaussian mixture, speech recognition, domain knowledge extraction, machine translation, DNA sequence alignment, MCMC, time series analysis, parsing, Bayesian network, neural network, IR, semantic analysis, SQL, information integration, join query, PAC learning, circuit lower bound]

Page 9: Extraction of topic evolutions from references in scientific articles and its GPU acceleration

Future work

Devise an inference with less approximation.

Apply TERESA to SNS linked documents.

Implement a topic evolution browser.
