@twitter Mining #Microblogs Using #Semantic Technologies

@twitter Mining #Microblogs Using #Semantic Technologies. Selver Softic, Martin Ebner, Herbert Mühlburger, Thomas Altmann, Behnam Taraghi

description

Presentation by Selver Softic at the 6th Workshop on Semantic Web Applications and Perspectives (SWAP 2010)

Transcript of @twitter Mining #Microblogs Using #Semantic Technologies

Page 1: @twitter Mining #Microblogs Using #Semantic Technologies

@twitter Mining #Microblogs Using #SemanticTechnologies

Selver Softic, Martin Ebner, Herbert Mühlburger, Thomas Altmann, Behnam Taraghi

Page 2: @twitter Mining #Microblogs Using #Semantic Technologies

Web 2.0 - well known story

• Web 2.0 technologies brought users closer to the Web …
– Wikis, Blogs, Forums …
– Podcasts, RSS, XML …

… then users started to generate content …

Source: http://mediabistro.com

Page 3: @twitter Mining #Microblogs Using #Semantic Technologies

From Web to Social Web

• Result = a vast amount of information
– Text, Pictures, Audio, Videos …
• Communication, networking, exchange of data
• The Web became more personal
• Cultural, geographical and social borders disappeared

Source: http://www.ignitesocialmedia.com

Page 4: @twitter Mining #Microblogs Using #Semantic Technologies

Social Media Boom!

Page 5: @twitter Mining #Microblogs Using #Semantic Technologies
Page 6: @twitter Mining #Microblogs Using #Semantic Technologies

Social sites are data silos

source: www.pidgintech.com

Page 7: @twitter Mining #Microblogs Using #Semantic Technologies

But still disconnected ?

source: www.pidgintech.com

Page 8: @twitter Mining #Microblogs Using #Semantic Technologies

Data is still captured in a walled garden!

Page 9: @twitter Mining #Microblogs Using #Semantic Technologies

Statements

• Social Web relies on users and communication among them

• While communicating, users produce or consume content
• Social sites are data silos rich in a variety of information
• This information could be interesting for:
– monitoring of trends, advertising, statistics, reputation, news broadcasting, tagging …
• This data is captured in a walled garden!

Page 10: @twitter Mining #Microblogs Using #Semantic Technologies

Questions

• How can this data be used to gain more useful insights?
• What are the advantages of online (offline) search on such data, and how can it be reached in a uniform way?
• Is it possible to structure, connect and expose the data so that humans and machines can use it more efficiently?
• What would an architecture for this look like?

Page 11: @twitter Mining #Microblogs Using #Semantic Technologies

Social Web Trends

• Microblogging
• Social Bookmarking
• Social Networking
• Social Marketing
• Sharing Photos, Videos …

Source: http://socialwebresearch.com

Page 12: @twitter Mining #Microblogs Using #Semantic Technologies

Microblogs

• Microblogs
– Used for communication, publishing and information exchange
– Simple to process
– Information generated by many different users
– Social user relations
– Tripartite communication structure
– Variety of information
– No boundaries by culture, location or technology (mobile users)

• Twitter
– Most popular
– Large amount of data
– But limited

According to http://an.kaist.ac.kr/traces/WWW2010.html: 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets

Page 13: @twitter Mining #Microblogs Using #Semantic Technologies

Semantic aspects and Twitter

• Twitter
– User relations
– Tweets as short information artefacts
– Communication with tripartite pattern
– Time-related information

• Vocabularies
– SIOC, FOAF, Dublin Core

Page 14: @twitter Mining #Microblogs Using #Semantic Technologies

Linked Data and Twitter

• Twitter contains information on:
– People, Organisations, Locations, Trends …

• LOD Cloud contains
– Billions of triples about:
• Geolocations, data about science, government, common knowledge, persons, news …

• Vocabularies
– MOAT, CommonTag

Page 15: @twitter Mining #Microblogs Using #Semantic Technologies

Architecture model

Page 16: @twitter Mining #Microblogs Using #Semantic Technologies

Acquisition - Grabeeter

Page 17: @twitter Mining #Microblogs Using #Semantic Technologies

Grabeeter

• Search in your Tweets
• Filter your Tweets by date
• Search in your Tweets offline using the Grabeeter Client
• Filter your Tweets offline using the Grabeeter Client
• Grabeeter provides an API

Page 18: @twitter Mining #Microblogs Using #Semantic Technologies

Triplification Module

• Author
• Date
• Content
• Receiver

<tweet url="http://grabeeter.tugraz.at/tweet/199272" text="Sitting in Prater #vienna, launch party. Nice" screen_name="selvers" created="2010-08-19" twitterUrl="http://twitter.com/selvers/status/21606926237"/>

Triplifier → RDF Store
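The mapping from a Grabeeter tweet record to RDF can be sketched in a few lines of Python. This is only an illustration under assumptions, not the Triplifier used in the project: the function names `triplify` and `escape_literal` are hypothetical, the output is N-Triples rather than Turtle, and only the attributes shown in the sample record above are handled. The creator URI pattern `http://twitter.com/<screen_name>/` is taken from the RDF example on the next slide.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the vocabularies named in the slides (SIOC, Dublin Core).
SIOC = "http://rdfs.org/sioc/ns#"
SIOCT = "http://rdfs.org/sioc/types#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def triplify(tweet_xml):
    """Turn one <tweet/> record into a list of N-Triples lines (hypothetical helper)."""
    tweet = ET.fromstring(tweet_xml)
    post = tweet.get("twitterUrl")
    text = escape_literal(tweet.get("text"))
    created = tweet.get("created")
    # Assumption: the author URI is derived from the screen name.
    creator = "http://twitter.com/" + tweet.get("screen_name") + "/"
    return [
        f"<{post}> <{RDF_TYPE}> <{SIOCT}MicroblogPost> .",
        f'<{post}> <{SIOC}content> "{text}" .',
        f"<{post}> <{SIOC}has_creator> <{creator}> .",
        f'<{post}> <{DCTERMS}created> "{created}" .',
    ]


SAMPLE = (
    '<tweet url="http://grabeeter.tugraz.at/tweet/199272" '
    'text="Sitting in Prater #vienna, launch party. Nice" '
    'screen_name="selvers" created="2010-08-19" '
    'twitterUrl="http://twitter.com/selvers/status/21606926237"/>'
)

if __name__ == "__main__":
    for line in triplify(SAMPLE):
        print(line)
```

Emitting N-Triples keeps the sketch free of third-party dependencies; a real pipeline would more likely build the graph with an RDF library and load it into the RDF store.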

Page 19: @twitter Mining #Microblogs Using #Semantic Technologies

Triplification Module

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix sioct: <http://rdfs.org/sioc/types#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://twitter.com/selvers/status/21606926237> rdf:type sioct:MicroblogPost ;
    sioc:content "Sitting in Prater #vienna, launch party. Nice" ;
    sioc:has_creator <http://twitter.com/selvers/> ;
    foaf:maker <http://grabeeter.tugraz.at/foaf/selvers/> ;
    dcterms:created "2010-08-19" ;
    owl:sameAs <http://grabeeter.tugraz.at/tweet/199272> .

<http://twitter.com/selvers/> rdf:type foaf:Person ;
    foaf:name "Selver Softic" ;
    foaf:depiction <http://a0.twimg.com/profile_images/905118560/f9e4b6eba.13070201_3_normal.jpg> ;
    foaf:knows <http://twitter.com/hmuehlburger/> ;
    foaf:knows <http://twitter.com/mhausenblas/> ;
    foaf:knows <http://twitter.com/mebner/> .

Page 20: @twitter Mining #Microblogs Using #Semantic Technologies

Interlinking Module

• Hashtags (People, Organisations, Locations)
• MOAT, CommonTag
• Later: NLP-processed content, SILK Framework

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX sioct: <http://rdfs.org/sioc/types#>

SELECT ?post ?content ?maker ?name WHERE {
    ?post rdf:type sioct:MicroblogPost ;
        foaf:maker ?maker ;
        sioc:content ?content .
    ?maker foaf:name ?name .
    FILTER(regex(?content, "#vienna"))
}

tag:tagName "vienna" ;
moat:tagMeaning <http://dbpedia.org/resource/Vienna> ;
tag:taggedResource <http://twitter.com/selvers/status/2160692623>
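The hashtag-based interlinking step could look roughly like the following Python sketch. It is an assumption-laden illustration, not the Interlinking Module itself: the helper names are hypothetical, and guessing the DBpedia resource by capitalising the hashtag is a naive placeholder for a real lookup or disambiguation step (for example via MOAT or the SILK Framework mentioned above). The dictionary keys mirror the property names in the fragment above.

```python
import re

# A hashtag is a '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(content):
    """Return the lower-cased hashtags found in a tweet's text."""
    return [tag.lower() for tag in HASHTAG.findall(content)]


def candidate_meaning(tag):
    """Guess a DBpedia resource URI for a tag.

    Assumption: capitalising the tag yields the resource name. This works for
    'vienna' but will be wrong for many tags; a real system needs a lookup.
    """
    return "http://dbpedia.org/resource/" + tag.capitalize()


def tag_post(post_uri, content):
    """Build one tag description per hashtag in the post (hypothetical helper)."""
    return [
        {
            "tagName": tag,
            "tagMeaning": candidate_meaning(tag),
            "taggedResource": post_uri,
        }
        for tag in extract_hashtags(content)
    ]


if __name__ == "__main__":
    post = "http://twitter.com/selvers/status/21606926237"
    text = "Sitting in Prater #vienna, launch party. Nice"
    for tag in tag_post(post, text):
        print(tag)
```

Keeping extraction and meaning lookup in separate functions means the naive `candidate_meaning` can later be replaced by a proper linking service without touching the rest.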

Classifier

Page 21: @twitter Mining #Microblogs Using #Semantic Technologies

Analysis

Page 22: @twitter Mining #Microblogs Using #Semantic Technologies

Conclusions & Outlook

• Current state-of-the-art technologies suffice to realise the proposed architecture paradigm
• Interlinking with LOD Cloud (Tweet-O-Sphere)
• Involving NLP methods
• Sentiment classification
• (Re)Tagging of Tweets
• Providing SPARQL Endpoint + Lookup Service as research interface
• Social Semantic Web Apps

Page 23: @twitter Mining #Microblogs Using #Semantic Technologies

Questions?