Sophia (Xueyao) Liang, CPSC 503 Final Project
![Page 1: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/1.jpg)
Sophia (Xueyao) Liang
CPSC 503 Final Project
![Page 2: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/2.jpg)
K=3
Unsupervised
$P(z_1 \mid d)$, $P(z_2 \mid d)$, $P(z_3 \mid d)$
Olympic, vancouver
Snow, cold
Moon light, spider man
![Page 3: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/3.jpg)
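|    | W1 | W2 | W3 | W4 | … |
|----|----|----|----|----|---|
| D1 | 1  | 0  | 1  | 1  | … |
| D2 | …  | …  | …  | …  | … |
| D3 | …  | …  | …  | …  | … |
| …  | …  | …  | …  | …  | … |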
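A binary document-word matrix of this shape could be built roughly as follows. This is only a sketch: the helper name `binary_doc_word_matrix`, the whitespace tokenization, and the toy corpus are assumptions for illustration, not the preprocessing used in the project.

```python
import numpy as np

def binary_doc_word_matrix(docs):
    """Return (matrix, vocab) where matrix[i, j] = 1 if word j occurs in document i."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    vocab = sorted({w for doc in docs for w in doc.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    matrix = np.zeros((len(docs), len(vocab)), dtype=int)
    for i, doc in enumerate(docs):
        for w in doc.lower().split():
            matrix[i, index[w]] = 1
    return matrix, vocab

# Hypothetical toy corpus echoing the keywords on the earlier slide.
docs = ["olympic vancouver snow", "snow cold", "moon light spider man"]
X, vocab = binary_doc_word_matrix(docs)
```

Rows correspond to documents D1, D2, … and columns to words W1, W2, … in sorted vocabulary order.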
![Page 4: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/4.jpg)
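|    | W1 | W2 | W3 | W4 | … |
|----|----|----|----|----|---|
| D1 | 1  | 0  | 1  | 1  | … |
| D2 | …  | …  | …  | …  | … |
| D3 | …  | …  | …  | …  | … |
| …  | …  | …  | …  | …  | … |

$$z_k \in \{z_1, z_2, \dots, z_N\}$$

$$L = \sum_{i} \sum_{j} \ln p(d_i, w_j)$$

$$p(d_i, w_j) = \sum_{k} p(z_k)\, p(d_i \mid z_k)\, p(w_j \mid z_k)$$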
![Page 5: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/5.jpg)
Expectation:

$$p(z_k \mid d_i, w_j) = \frac{p(z_k)\, p(d_i \mid z_k)\, p(w_j \mid z_k)}{\sum_{k'} p(z_{k'})\, p(d_i \mid z_{k'})\, p(w_j \mid z_{k'})}$$

Maximization:

$$p(w_j \mid z_k) = \frac{\sum_{i} p(z_k \mid d_i, w_j)}{\sum_{i, j'} p(z_k \mid d_i, w_{j'})}$$

$$p(d_i \mid z_k) = \frac{\sum_{j} p(z_k \mid d_i, w_j)}{\sum_{i', j} p(z_k \mid d_{i'}, w_j)}$$

$$p(z_k) = \frac{\sum_{i, j} p(z_k \mid d_i, w_j)}{\sum_{i, j, k'} p(z_{k'} \mid d_i, w_j)}$$
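The expectation and maximization updates can be sketched in NumPy as below. Several things here are assumptions rather than the project's implementation: the function name `plsa_em`, the random initialization, the small `eps` guard against division by zero, and the choice to weight each posterior by the matrix entry (so only observed document-word pairs contribute, as in standard PLSA).

```python
import numpy as np

def plsa_em(X, K, n_iter=30, seed=0, eps=1e-12):
    """Fit p(z), p(d|z), p(w|z) to a binary doc-word matrix X with EM (sketch)."""
    rng = np.random.default_rng(seed)
    D, W = X.shape
    pz = np.full(K, 1.0 / K)
    pd_z = rng.random((D, K))
    pd_z /= pd_z.sum(axis=0)          # each column is a distribution over documents
    pw_z = rng.random((W, K))
    pw_z /= pw_z.sum(axis=0)          # each column is a distribution over words
    for _ in range(n_iter):
        # E-step: p(z_k | d_i, w_j), array of shape (D, W, K)
        joint = pz[None, None, :] * pd_z[:, None, :] * pw_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)
        # Assumption: weight by X so that only observed pairs contribute.
        post = post * X[:, :, None]
        # M-step: renormalized sums of the posteriors
        pw_z = post.sum(axis=0)
        pw_z /= pw_z.sum(axis=0) + eps
        pd_z = post.sum(axis=1)
        pd_z /= pd_z.sum(axis=0) + eps
        pz = post.sum(axis=(0, 1))
        pz /= pz.sum() + eps
    return pz, pd_z, pw_z

# Hypothetical toy matrix with two obvious blocks of documents and words.
X_toy = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]])
pz, pd_z, pw_z = plsa_em(X_toy, K=2)
```

The dense `(D, W, K)` posterior array keeps the code close to the formulas but would be too large for a full corpus; a sparse loop over observed pairs would be the usual alternative.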
![Page 6: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/6.jpg)
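|    | D1 | D2 | D3 | D4 | … |
|----|----|----|----|----|---|
| D1 | 1  | 0  | 1  | 1  | … |
| D2 | …  | …  | …  | …  | … |
| D3 | …  | …  | …  | …  | … |
| …  | …  | …  | …  | …  | … |

$p(d_i, c_j)$, $p(d_i, w_j)$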
![Page 7: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/7.jpg)
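|    | W1 | W2 | W3 | W4 | … |
|----|----|----|----|----|---|
| D1 | 1  | 0  | 1  | 1  | … |
| D2 | …  | …  | …  | …  | … |
| D3 | …  | …  | …  | …  | … |
| …  | …  | …  | …  | …  | … |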
![Page 8: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/8.jpg)
![Page 9: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/9.jpg)
$w(d_i, d_j)$:

1. $w(d_i, d_{i'}) = 1 \ (i = i')$, $\quad w(d_i, d_{i'}) = 0 \ (i \neq i')$

2. $$w(d_i, d_{i'}) = \frac{C}{|I(d_i)|\,|I(d_{i'})|} \sum_{m=1}^{|I(d_i)|} \sum_{n=1}^{|I(d_{i'})|} w\big(I_m(d_i), I_n(d_{i'})\big)$$
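One way to read the two items above is as a SimRank-style computation: item 1 as the starting weights and item 2 as the update applied repeatedly. The sketch below follows that reading, which is an assumption. Also assumed: `I(d)` is the list of documents linking to `d`, the constant `C = 0.8`, the iteration count, and the name `simrank_weights`.

```python
import numpy as np

def simrank_weights(in_links, C=0.8, n_iter=5):
    """in_links[i] lists the documents in I(d_i); returns the weight matrix w."""
    n = len(in_links)
    w = np.eye(n)  # item 1: weight 1 when i == i', 0 otherwise
    for _ in range(n_iter):
        new_w = np.eye(n)
        for a in range(n):
            for b in range(n):
                if a == b or not in_links[a] or not in_links[b]:
                    continue  # keep 1 on the diagonal, 0 when I(d) is empty
                # item 2: average previous weights over all pairs of in-links
                total = sum(w[m, k] for m in in_links[a] for k in in_links[b])
                new_w[a, b] = C * total / (len(in_links[a]) * len(in_links[b]))
        w = new_w
    return w

# Hypothetical graph: document 0 links to documents 1 and 2.
in_links = [[], [0], [0]]
w = simrank_weights(in_links)
```

In this toy graph documents 1 and 2 share their only in-link, so their weight is `C`, while document 0 has no in-links and gets weight 0 with both.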
![Page 10: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/10.jpg)
$$O = (1 - \lambda) \cdot L - \lambda \cdot R$$

$$R = \sum_{i, i'} w(d_i, d_{i'}) \sum_{k} \big(p(z_k \mid d_i) - p(z_k \mid d_{i'})\big)^2 \quad (i \neq i')$$

$$p(z_k \mid d_i) \propto p(z_k) \cdot p(d_i \mid z_k)$$
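The regularizer and combined objective can be evaluated with a few lines of NumPy. This sketch makes assumptions: the function names are hypothetical, `lam` stands for the trade-off weight, the sum in R runs over pairs with `i != i'`, and the objective is taken as `(1 - lam) * L - lam * R`, i.e. a quantity to maximize. The sign convention in particular is an assumption.

```python
import numpy as np

def topic_given_doc(pz, pd_z):
    """p(z_k | d_i), proportional to p(z_k) * p(d_i | z_k), normalized over k."""
    unnorm = pd_z * pz[None, :]
    return unnorm / unnorm.sum(axis=1, keepdims=True)

def regularizer(w, pz_d):
    """R: weighted squared differences of topic distributions between documents."""
    diff = pz_d[:, None, :] - pz_d[None, :, :]   # shape (D, D, K)
    sq = (diff ** 2).sum(axis=2)                 # shape (D, D)
    off_diag = ~np.eye(len(w), dtype=bool)       # restrict to i != i'
    return float((w * sq)[off_diag].sum())

def objective(log_lik, R, lam):
    """Assumed combination: reward log-likelihood L, penalize roughness R."""
    return (1 - lam) * log_lik - lam * R

# Hypothetical values for illustration only.
pz_ex = np.array([0.5, 0.5])
pd_z_ex = np.array([[0.5, 0.0], [0.5, 0.0], [0.0, 1.0]])
pz_d_ex = topic_given_doc(pz_ex, pd_z_ex)
R_ex = regularizer(np.ones((3, 3)), pz_d_ex)
O_ex = objective(-10.0, R_ex, lam=0.5)
```

R is small when strongly linked documents have similar topic distributions, which is how the network structure enters the model.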
![Page 11: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/11.jpg)
Parameter Inference: No closed-form solution for the expectation step

Efficient Algorithm:

- Expectation (PLSA)
- Maximization (PLSA)
- The result of the previous steps may not end in a better value for O

$$p(d_i \mid z_k) \leftarrow (1 - \lambda)\, p(d_i \mid z_k) + \lambda\, \frac{\sum_{i'} w(d_i, d_{i'})\, p(d_{i'} \mid z_k)}{\sum_{i'} w(d_i, d_{i'})}$$
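The smoothing update can be sketched as a single matrix operation. This is an illustrative sketch: the function name and the handling of documents with no neighbours (left unchanged) are assumptions, and the columns are not renormalized afterwards.

```python
import numpy as np

def smooth_doc_given_topic(pd_z, w, lam):
    """Blend p(d_i|z_k) with the weighted average over linked documents d_i'."""
    pd_z = np.asarray(pd_z, dtype=float)
    neighbor_sum = w @ pd_z                       # sum_i' w(d_i, d_i') p(d_i'|z_k)
    degree = w.sum(axis=1, keepdims=True)         # sum_i' w(d_i, d_i')
    # Assumption: a document with zero total weight keeps its own value.
    avg = np.divide(neighbor_sum, degree, out=pd_z.copy(), where=degree > 0)
    return (1 - lam) * pd_z + lam * avg

# Hypothetical example: two documents linked to each other.
pd_z_before = np.array([[1.0, 0.0], [0.0, 1.0]])
w_pair = np.array([[0.0, 1.0], [1.0, 0.0]])
pd_z_after = smooth_doc_given_topic(pd_z_before, w_pair, lam=0.5)
```

With `lam = 0` the update leaves the PLSA estimate untouched; larger values pull linked documents toward each other.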
![Page 12: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/12.jpg)
Potential Problems of the model

Parameter Inference
Higher time complexity and slower to converge

$O = (1 - \lambda) \cdot L - \lambda \cdot R$, with $L \approx -10000$ and $R \approx 100$
![Page 13: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/13.jpg)
Cora Data version 1.0

- 30000 scientific papers, with citation information
- Important files:
  - papers (ID-name, link, author…..)
  - citations (ID-cited ID)
  - classifications (link-category)
- Directory: extractions (PostScript form of the papers)

Issues:

- Cited paper not in the corpus
- No abstract for some PostScript files
- Too many categories
- Duplicated or isolated papers
![Page 14: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/14.jpg)
Cora Data version 1.0

- Papers in category Machine Learning
- About 2700 papers
- 1400 frequent words (stop words removed, stemmed)

| Category | Papers |
|----------|--------|
| Theory | 315 |
| Reinforcement | 217 |
| Genetic Algorithms | 418 |
| Neural Networks | 818 |
| Probabilistic | 426 |
| Case based | 298 |
| Rule Learning | 180 |
![Page 15: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/15.jpg)
$$\arg\max_{k}\; p(z_k \mid d)$$
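Assigning each document to its most probable topic is a one-line operation once the topic distributions are available. In this sketch, `pz_d` is assumed to be a documents-by-topics array of $p(z_k \mid d)$ values, and the numbers are made up for illustration.

```python
import numpy as np

def assign_topics(pz_d):
    """Return, for each document (row), the index k that maximizes p(z_k | d)."""
    return np.argmax(pz_d, axis=1)

# Hypothetical topic distributions for two documents and K = 3 topics.
pz_d_demo = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.1, 0.8]])
labels = assign_topics(pz_d_demo)
```

Comparing these labels with the known categories still requires a mapping from topic index to category, which is not specified here.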
![Page 16: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/16.jpg)
Figure: Accuracy and recall for each category. (A) Accuracy, (B) Recall

|                  | PHITS | PLSA  | NetPLSA |
|------------------|-------|-------|---------|
| Overall Accuracy | 0.470 | 0.501 | 0.562   |
![Page 17: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/17.jpg)
Supported the claim that adding network structure to the model can improve the results of topic modeling
Modeled the network at the scale of articles
Inherent problems exist in the chosen framework
The result is still far from satisfactory
![Page 18: Sophia(Xueyao) Liang CPSC 503 Final Project. K=3 Unsupervised P( |d) Olympic, vancouver Snow, cold Moon light, spider man.](https://reader036.fdocuments.in/reader036/viewer/2022062412/5a4d1b177f8b9ab0599921ed/html5/thumbnails/18.jpg)
- How to model the network structure of blog articles, especially when modeling them at the scale of articles
- Bag-of-words matrix extraction
- Better integrated model, maybe LDA-based
- Efficiency of the algorithm
- Recommendation based on topic community discovery