![Page 1: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/1.jpg)
Utilizing Linked Open Data (LOD) Resources for
Semantic Enhancement of User-Generated Content
Dong-Po Deng1,2, Guan-Shuo Mai3, Cheng-Hsin Hsu3, Chin-Lung Chang1,4, Tyng-Ruey Chuang1, and Kwang-Tsao Shao3
1ITC, University of Twente, Enschede, the Netherlands
2Institute of Information Science & 3Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
4Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Thursday, February 7, 2013
![Page 2: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/2.jpg)
2012/12/3 JIST2012
Outline
Background
Motivation
Data Collection
LOD resources - LODE and LOD TGN
An approach for processing UGC
  Information Extraction
  Information Formalization
  Information Reuse
Concluding remarks
![Page 4: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/4.jpg)
Background
Web 2.0 technologies enable people to contribute their own content on the web, e.g. wikis, blogs, and tagging
Social media use Web 2.0 technologies to support social interaction on the web, e.g. Twitter, Flickr, and Facebook
Content contributed by people on the web and on social media is called “User-Generated Content” (UGC)
UGC is mainly multimedia or textual data
UGC is considered a potential resource for scientific projects, e.g. citizen science
![Page 5: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/5.jpg)
Background (cont.)
There are several problems in harvesting UGC for scientific purposes:
Unstructured UGC is difficult to handle
The semantics of UGC are often ambiguous or poor
Social media are not designed for scientific purposes
Image courtesy of http://www.datenform.de/mapeng.html
![Page 7: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/7.jpg)
Motivation
LOD datasets as resources
LOD aims to make data available on the Web and to interconnect data in order to increase its value for users
LOD projects have published about 300 datasets consisting of over 31 billion RDF triples
Each entry representing a fact in an LOD dataset has a Uniform Resource Identifier (URI), which is referenceable and linkable on the Web
The high interconnectivity between entries potentially increases discoverability, reusability, and the utility of information
![Page 8: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/8.jpg)
Motivation (cont.)
Therefore, if named entities in UGC can be identified and connected to LOD entries, their meaning can be disambiguated, making the UGC easier to process.
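The idea on this slide can be sketched as a simple lookup that maps mentions in a post to LOD-style URIs. This is only a minimal illustration: the lexicon, the `example.org` URIs, and the function name `link_entities` are hypothetical and are not taken from the slides.

```python
# Minimal sketch: link named entities in a post to LOD-style URIs.
# ASSUMPTION: the lexicon and the example.org URIs are made up for illustration.
LEXICON = {
    "Theretra nessus": "http://example.org/lode/taxon/theretra-nessus",
    "Taipei": "http://example.org/tgn/place/taipei",
}

def link_entities(text, lexicon=LEXICON):
    """Return (mention, uri) pairs for every lexicon entry found in text."""
    found = []
    lowered = text.lower()
    for mention, uri in lexicon.items():
        # Case-insensitive substring match; real UGC needs fuzzier matching.
        if mention.lower() in lowered:
            found.append((mention, uri))
    return found

post = "Saw a Theretra nessus near Taipei last night."
for mention, uri in link_entities(post):
    print(mention, "->", uri)
```

Once a mention carries a URI, two posts that spell a name differently can still be recognized as referring to the same entry.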
![Page 10: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/10.jpg)
Data collection
Two Facebook interest groups for ecological observations in Taiwan
http://www.facebook.com/groups/roadkilled/
http://www.facebook.com/groups/enjoymoths/
![Page 11: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/11.jpg)
Ecological Observations on Facebook
![Page 13: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/13.jpg)
LOD Ecology
Linked Open Data of Ecology (LODE) is a validated dataset from a LOD project.
LODE integrated 5 previously distributed databases:
TFRI: Taiwan Forestry Research Institute
![Page 14: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/14.jpg)
LODE in Linked Open Data Cloud
![Page 16: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/16.jpg)
LOD Taiwan Geographic Name (TGN)
LOD TGN is derived mainly from the Taiwan Gazetteer and published following LOD principles
LOD TGN has 159,241 geographic name entries, of which 17,442 (about 11%) are linked to geonames.org
![Page 18: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/18.jpg)
An approach for processing UGC
Information Extraction
Information Formalization
Information Reuse
![Page 20: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/20.jpg)
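Problems with Chinese species names in Facebook ecological observations
(1) A Chinese species name has the form adjective + noun, and posts often keep only the leading part:
玉帶鳳蝶 (Papilio polytes) is shortened to 玉帶
曙鳳蝶 (Atrophaneura horishana) is shortened to 曙鳳
琉璃紋鳳蝶 (Papilio hermosanus) is shortened to 琉璃
(2) A shortened name can match many species:
細紋 (pronounced Si-Wen, meaning “fine veined”) is the prefix of 15 species names, e.g. 細紋新蠍蛉, 細紋蠍蛉, 細紋黃鉤蛾, ...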
![Page 21: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/21.jpg)
Identifying shortened species names
Confidence value = [formula not preserved in this transcript]
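A shortened name can be expanded by matching it against the prefixes of known species names. The sketch below shows that step on a toy list built from the names on the previous slide. The confidence formula on this slide is not available in the transcript, so `1 / (number of candidates)` is used purely as a stand-in assumption, and `expand_short_name` is a hypothetical helper name.

```python
# Sketch: expand a shortened Chinese species name by prefix matching.
# ASSUMPTION: the slide's confidence formula is not in the transcript;
# 1 / (number of candidates) is a stand-in, not the authors' definition.
SPECIES = ["玉帶鳳蝶", "曙鳳蝶", "琉璃紋鳳蝶", "細紋新蠍蛉", "細紋蠍蛉", "細紋黃鉤蛾"]

def expand_short_name(short, species=SPECIES):
    """Return (full_name, confidence) for each species starting with short."""
    candidates = [name for name in species if name.startswith(short)]
    if not candidates:
        return []
    confidence = 1.0 / len(candidates)  # fewer candidates, higher confidence
    return [(name, confidence) for name in candidates]

print(expand_short_name("玉帶"))  # a single candidate in this toy list
print(expand_short_name("細紋"))  # several candidates share this prefix
```

With this stand-in, an ambiguous prefix such as 細紋 gets a low confidence for every candidate, which signals that more context from the thread is needed.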
![Page 22: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/22.jpg)
Determining a species name for a thread
What if several species names are mentioned in one thread? We used three criteria:
How many Likes do the post or the comments get?
How prestigious are the people who post or comment?
How many times does a species name occur in the thread?
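The three criteria above could be combined into a single score per species name. The slides name the criteria but not how they are weighted, so the weights, the 0 to 1 `prestige` value, and the function `pick_species` below are all hypothetical; this is a sketch of one possible combination.

```python
from collections import defaultdict

# Sketch: choose one species name for a thread from several mentions.
# ASSUMPTION: the weights and the 0..1 "prestige" score are hypothetical;
# the slides list the three criteria but not how they are combined.
WEIGHTS = {"likes": 1.0, "prestige": 2.0, "count": 1.0}

def pick_species(mentions, weights=WEIGHTS):
    """mentions: list of dicts with keys 'name', 'likes', 'prestige'."""
    scores = defaultdict(float)
    for m in mentions:
        scores[m["name"]] += (
            weights["likes"] * m["likes"]
            + weights["prestige"] * m["prestige"]
            + weights["count"]  # each mention counts as one occurrence
        )
    return max(scores, key=scores.get) if scores else None

thread = [
    {"name": "玉帶鳳蝶", "likes": 3, "prestige": 0.2},
    {"name": "曙鳳蝶", "likes": 1, "prestige": 0.9},
    {"name": "玉帶鳳蝶", "likes": 0, "prestige": 0.5},
]
print(pick_species(thread))
```

A weighted sum keeps each criterion visible and easy to tune against hand-labelled threads.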
![Page 24: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/24.jpg)
The problems of geographic names in Facebook ecological observations
An example: The Endemic Species Research Institute
特有生物研究保育中心 (Te-You-Sheng-Wu-Yan-Jiou-Bao-Yu-Jhong-Sin) is shortened to 特生中心 (Te-Sheng-Jhong-Sin)
There are no rules for shortening long geographic names
![Page 25: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/25.jpg)
Identifying shortened geographic names
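The transcript does not preserve how shortened geographic names are identified. One plausible heuristic, shown here only as an assumption, is to test whether the characters of the short form appear in the same order inside a gazetteer name. The toy gazetteer and the helper names `is_subsequence` and `candidates` are hypothetical.

```python
# Sketch: match a shortened place name against gazetteer entries by
# checking whether its characters appear, in order, in a full name.
# ASSUMPTION: this subsequence heuristic is illustrative only; the slide
# does not describe the actual matching method.
GAZETTEER = ["特有生物研究保育中心", "生物多樣性研究中心"]  # toy entries

def is_subsequence(short, full):
    """True if the characters of short occur in full in the same order."""
    remaining = iter(full)
    # 'ch in remaining' advances the iterator, so order is enforced.
    return all(ch in remaining for ch in short)

def candidates(short, gazetteer=GAZETTEER):
    """Return every gazetteer name that contains short as a subsequence."""
    return [name for name in gazetteer if is_subsequence(short, name)]

print(candidates("特生中心"))
```

Because people shorten long names without fixed rules, a heuristic like this will return several candidates for many inputs and would need a ranking step on top.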
![Page 26: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/26.jpg)
The ontology...
is based on a Facebook thread, an entity composed of social media content involving people, places, time periods, photos, and links to other content
uses standard vocabularies:
Semantically-Interlinked Online Communities (SIOC) to represent the structure of Facebook posts, comments, and threads
Friend of a Friend (FOAF) to describe content creators
Dublin Core for the interlinked content they created
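To make the vocabulary choice concrete, the sketch below describes one thread and one post using SIOC, FOAF, and Dublin Core terms, and prints the result as N-Triples-style lines. The `example.org` URIs, the sample values, and the particular properties chosen are assumptions for illustration; the authors' ontology may use different terms.

```python
# Sketch: describe one Facebook thread with SIOC, FOAF and Dublin Core terms.
# ASSUMPTION: the example.org URIs, the sample values, and the choice of
# properties are illustrative; the authors' ontology may differ.
SIOC = "http://rdfs.org/sioc/ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"

def thread_triples(thread_id, post_id, user_id, user_name, created):
    """Build (subject, predicate, object) triples for one post in a thread."""
    thread = BASE + "thread/" + thread_id
    post = BASE + "post/" + post_id
    user = BASE + "user/" + user_id
    return [
        (thread, RDF_TYPE, SIOC + "Thread"),
        (post, RDF_TYPE, SIOC + "Post"),
        (post, SIOC + "has_container", thread),
        (post, SIOC + "has_creator", user),
        (user, FOAF + "name", '"%s"' % user_name),
        (post, DCT + "created", '"%s"' % created),
    ]

def to_ntriples(triples):
    """Serialize triples; objects starting with a quote are plain literals.
    Literal escaping is not handled in this sketch."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else "<%s>" % o
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)

print(to_ntriples(thread_triples("t1", "p1", "u1", "Observer", "2012-12-03")))
```

Reusing established vocabularies means generic Linked Data tools can read the thread structure without knowing anything specific to this project.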
![Page 27: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/27.jpg)
An ontology for formalizing the extracted information from Facebook threads
![Page 29: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/29.jpg)
Transferring ecological observations on Facebook to RDF
http://140.109.28.64:2020/page/thread/177883715557195_440860179259546
![Page 31: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/31.jpg)
The extracted species name from the Facebook thread is linked to LOD resources
![Page 35: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/35.jpg)
The extracted species name corresponds to the taxon Theretra nessus
This entry is connected to LODE via owl:sameAs
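Because owl:sameAs is symmetric and transitive, URIs connected by such links can be grouped into sets that denote the same thing. The sketch below does this with a small union-find over a list of triples. All URIs are placeholders rather than real LODE or LOD TGN identifiers, and `same_as_groups` is a hypothetical helper.

```python
# Sketch: group URIs connected by owl:sameAs into equivalence sets.
# ASSUMPTION: all example.org URIs below are placeholders, not real
# LODE or LOD TGN identifiers.
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"

triples = [
    ("http://example.org/fb/taxon/1", OWL_SAMEAS, "http://example.org/lode/taxon/A"),
    ("http://example.org/lode/taxon/A", OWL_SAMEAS, "http://example.org/other/taxon/X"),
    ("http://example.org/fb/place/9", OWL_SAMEAS, "http://example.org/tgn/place/Z"),
]

def same_as_groups(triples):
    """Union-find over owl:sameAs links (symmetric and transitive)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == OWL_SAMEAS:
            parent[find(s)] = find(o)  # merge the two sets

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())

for group in same_as_groups(triples):
    print(sorted(group))
```

Grouping in this way lets a query about the Facebook-derived entry also reach whatever the linked dataset says about the same taxon or place.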
![Page 37: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/37.jpg)
The extracted place name from the Facebook thread is linked to LOD resources
![Page 41: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/41.jpg)
The LOD TGN entry, transferred from the Taiwan Gazetteer
It is linked to geonames.org via owl:sameAs
![Page 43: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/43.jpg)
Publish the processed Facebook ecological observations
![Page 45: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/45.jpg)
A semantic annotation plug-in for entering geographic names in Facebook posts
![Page 48: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/48.jpg)
![Page 50: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/50.jpg)
Concluding remarks
This study reports our experience in transferring Facebook ecological observations to RDF and interlinking them with LOD resources (LODE and LOD TGN)
With these information extraction tools and LOD resources, we developed a tool for the semantic enhancement of user input.
LOD TGN is an ongoing project. In the future, we will consolidate the feature types of the geographic names, and we plan to make LOD TGN a reference resource for geospatial semantics.
![Page 51: JIST 2012](https://reader033.fdocuments.in/reader033/viewer/2022052823/555115dbb4c905f10b8b4e25/html5/thumbnails/51.jpg)
Thank you for your attention
Questions?