Linked data at globo.com - Web of Linked Entities (WoLE 2013) - WWW 2013

Post on 15-Jul-2015


Linked Data at globo.com

Semantic Team
semantica@corp.globo.com

Ícaro Medeiros
icaro.medeiros@gmail.com

globo.com
Web of Linked Entities 2013
WWW 2013

Monday, May 13, 13

Who we are


BROADCAST, MOVIES, PAY TV, INTERNET, EVENTS, MUSIC, PUBLISHING, NEW VENTURES, NEWSPAPER, RADIO NETWORK

31.4MM unique visitors/month*

* source: Ibope, 04/13

globo.com

Linked Data at globo.com

Semantic Web team

Mission

Organize and distribute all content produced by Organizações Globo

Ontologies

Current scenario

Base: news, sports, gossip, tv

Future scenario

Upper: Person, Place, Organization

Music, Politics, Programme, Education, Sports

Annotation tool


Technologies


Embedded into our existing CMSs

Web CMS

Video publishing system (developed in-house)

Common UX for content producers

Interface adapts itself to ontology

Annotations stored in Virtuoso triple store


Interface follows the ontology

Fields

Search ranges

Suggest as you type

Automatic entity extraction

Automatic page generation

globoesporte.com


Brainiak: Linked data RESTful API

Legacy architecture

[Diagram. Labels: CMS, CDA, API, triple store, search engine, process queue. The suggest, annotation, and entity extractor components each appear several times.]

New architecture

triple store

search engine

BRAINIAK

API

Goal

Linked data made simple!

Requirements

Authorization and authentication

Reduce the need to write SPARQL queries

Single point of access to the triple store

Data management quality (deduplication, validation, integration with external datasets, etc.)

Requirements

RESTful hypermedia API

Performance enhancement through caching

Technologies


Main concepts


Context: isolated dataspace (graph)

Example: Sports (context)

Schema: definition of a data type (i.e. definition of a class)

Example: Team (schema) in Sports (context)

Collection and instance

Collection: Team (in the Sports context, described by the Team schema)

Instance: Barcelona (in the Team collection)
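The three concepts nest in the same way as the URL paths that appear later in the deck (/sports/Team/Barcelona). The following is a minimal sketch of that nesting, not Brainiak's own data model: the class names `Context`, `Collection`, and `Instance`, and their `path()` and `uri()` methods, are hypothetical, and the base URI is taken from the JSON example in these slides.

```python
from dataclasses import dataclass

# Base URI as it appears in the "@id" of the JSON example in the slides.
BASE_URI = "http://semantica.globo.com"


@dataclass(frozen=True)
class Context:
    """An isolated dataspace (graph), e.g. "sports". Hypothetical class."""
    name: str

    def path(self) -> str:
        return f"/{self.name}"


@dataclass(frozen=True)
class Collection:
    """All instances of one class (schema) inside a context. Hypothetical class."""
    context: Context
    class_name: str

    def path(self) -> str:
        return f"{self.context.path()}/{self.class_name}"


@dataclass(frozen=True)
class Instance:
    """A single resource inside a collection. Hypothetical class."""
    collection: Collection
    instance_id: str

    def path(self) -> str:
        return f"{self.collection.path()}/{self.instance_id}"

    def uri(self) -> str:
        return BASE_URI + self.path()


sports = Context("sports")
team = Collection(sports, "Team")
barcelona = Instance(team, "Barcelona")

print(team.path())       # /sports/Team
print(barcelona.path())  # /sports/Team/Barcelona
print(barcelona.uri())   # http://semantica.globo.com/sports/Team/Barcelona
```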


Hypermedia API


Hypermedia

Content negotiation

Decoupling of server and client side

Simple access (no need to remember URLs; navigate instead)

Application is treated as a state machine

Relations

Resources: /sports/Team/Barcelona, /sports/Team

Link relations: self, inCollection, item, create, delete, replace
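A client of a hypermedia API follows named relations instead of building URLs itself. The sketch below illustrates that idea using the relation names from this slide. The link format (the `rel`, `href`, and `method` keys), the assignment of relations to the instance resource, and the HTTP methods are all assumptions made for illustration; the slides do not show how Brainiak serializes links.

```python
# Hypothetical links as a server might attach them to /sports/Team/Barcelona.
# Relation names come from the slide; everything else is assumed.
INSTANCE_LINKS = [
    {"rel": "self", "href": "/sports/Team/Barcelona", "method": "GET"},
    {"rel": "inCollection", "href": "/sports/Team", "method": "GET"},
    {"rel": "replace", "href": "/sports/Team/Barcelona", "method": "PUT"},
    {"rel": "delete", "href": "/sports/Team/Barcelona", "method": "DELETE"},
]


def find_link(links, rel):
    """Return the first link with the given relation name."""
    for link in links:
        if link["rel"] == rel:
            return link
    raise KeyError(f"no link with rel={rel!r}")


# Navigate from the instance to its collection without hard-coding the URL.
collection_link = find_link(INSTANCE_LINKS, "inCollection")
print(collection_link["method"], collection_link["href"])  # GET /sports/Team
```

Because the client only depends on relation names, the server can change its URL layout without breaking it, which is the decoupling the previous slide refers to.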


Services


Contexts: list contexts

Collections: list collections (of a context)

Schemas: retrieve schema of a class (collection)

Instances

List instances (same type)

Retrieve instance

Create instance

Update instance

Delete instance

GET /sports/Team/Barcelona

{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sports": "http://semantica.globo.com/sports/",
    "upper": "http://semantica.globo.com/upper/"
  },
  "@id": "http://semantica.globo.com/sports/Team/Barcelona",
  "@type": "sports:Team",
  "rdfs:label": "Barcelona",
  "upper:fullName": "Futbol Club Barcelona",
  "upper:acronym": "BARCELONA"
}
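The response uses prefixed names such as sports:Team, with the prefixes declared under "@context". As a rough illustration of how a client might turn those into full URIs, here is a minimal sketch using only the standard library. The helper name `expand` is hypothetical, and the body is the slide's example with its comma placement corrected so that it parses as JSON. A real client would more likely use a JSON-LD library.

```python
import json

# The example response from the slide, as valid JSON.
RESPONSE_BODY = """
{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sports": "http://semantica.globo.com/sports/",
    "upper": "http://semantica.globo.com/upper/"
  },
  "@id": "http://semantica.globo.com/sports/Team/Barcelona",
  "@type": "sports:Team",
  "rdfs:label": "Barcelona",
  "upper:fullName": "Futbol Club Barcelona",
  "upper:acronym": "BARCELONA"
}
"""


def expand(name, context):
    """Expand "prefix:local" to a full URI if the prefix is declared."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return name  # already a full URI, or an unknown prefix


doc = json.loads(RESPONSE_BODY)
context = doc["@context"]

type_uri = expand(doc["@type"], context)
# Keys starting with "@" are keywords, not properties of the instance.
properties = {
    expand(key, context): value
    for key, value in doc.items()
    if not key.startswith("@")
}

print(type_uri)  # http://semantica.globo.com/sports/Team
for uri, value in properties.items():
    print(uri, "=", value)
```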


Filtering instances


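PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sports: <http://semantica.globo.com/sports/>
SELECT * FROM <http://semantica.globo.com/sports/>
WHERE {
  ?s a sports:Team .
  ?s rdfs:label "Barcelona" .
}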


/sports/Team?p=rdfs:label&o=Barcelona
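The filter URL carries the same information as the SPARQL query shown for the same filter: the context, the class, a predicate (`p`), and an object (`o`). The sketch below shows one way such a URL could be translated into that query. It is an illustration of the idea, not Brainiak's implementation: the function name `filter_url_to_sparql` and the `PREFIXES` table are hypothetical, the object is always treated as a plain string literal, and only the predicate is validated.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Prefixes taken from the JSON example in the slides.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sports": "http://semantica.globo.com/sports/",
}

# Deliberately strict pattern for "prefix:local" names.
PREFIXED_NAME = re.compile(r"[A-Za-z][\w-]*:[\w-]+")


def filter_url_to_sparql(url):
    """Translate /<context>/<Class>?p=<predicate>&o=<literal> into SPARQL."""
    parts = urlsplit(url)
    context, class_name = parts.path.strip("/").split("/")
    params = parse_qs(parts.query)

    predicate = params["p"][0]
    if not PREFIXED_NAME.fullmatch(predicate):
        raise ValueError(f"unsupported predicate: {predicate!r}")

    # Escape backslashes and quotes so the value stays inside the literal.
    literal = params["o"][0].replace("\\", "\\\\").replace('"', '\\"')

    prefix_lines = "\n".join(
        f"PREFIX {prefix}: <{uri}>" for prefix, uri in PREFIXES.items()
    )
    return (
        f"{prefix_lines}\n"
        f"SELECT * FROM <{PREFIXES[context]}>\n"
        f"WHERE {{\n"
        f"  ?s a {context}:{class_name} .\n"
        f'  ?s {predicate} "{literal}" .\n'
        f"}}"
    )


query = filter_url_to_sparql("/sports/Team?p=rdfs:label&o=Barcelona")
print(query)
```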


Dealing with legacy and external graphs

/sports/?graph_uri=dbpedia:sports&class_uri=dbpedia-ont:Team
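The `graph_uri` and `class_uri` parameters are ordinary query-string parameters, so a client can build such URLs with the standard library. This is a small sketch under that assumption: the helper name `collection_url` is hypothetical, and colons are left unescaped only so the output matches the form used in the slides.

```python
from urllib.parse import urlencode


def collection_url(context, class_name=None, **params):
    """Build /<context>/[<Class>] with optional query parameters."""
    path = f"/{context}/" + (class_name or "")
    # safe=":" keeps prefixed names such as dbpedia:sports readable.
    query = urlencode(params, safe=":")
    return f"{path}?{query}" if query else path


external = collection_url(
    "sports",
    graph_uri="dbpedia:sports",
    class_uri="dbpedia-ont:Team",
)
print(external)  # /sports/?graph_uri=dbpedia:sports&class_uri=dbpedia-ont:Team

filtered = collection_url("sports", "Team", p="rdfs:label", o="Barcelona")
print(filtered)  # /sports/Team?p=rdfs:label&o=Barcelona
```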


Brainiak will be open sourced next Monday at the 1st Globo Semantic Day

Join now for free: bit.ly/semantic_day_globo

Follow us on GitHub: github.com/globocom

How we see the future

Inference-based navigation

SEO (automatic schema.org)

Richer content (e.g. timelines), frequent automatic updates

Better annotation suggestion (DBpedia Spotlight)

Linked with open data (DBpedia, dados.gov.br)

THANK YOU

Semantic Team
semantica@corp.globo.com

Ícaro Medeiros
icaro.medeiros@gmail.com

globo.com