Semantic Technology for Development: Semantic Web without the Web?

Post on 21-Jan-2018



Victor de Boer, Vrije Universiteit Amsterdam

With input from: Anna Bon, Christophe Guéret, Stephane Boyera, Nana Baah Gyan, Chris van Aart, Max Froumentin, Aman Grewal, Mary Allen, Amadou Tangara, Etienne Barnard, Hans Akkermans, Julie Ferguson, Marije Visscher,…

Victor de Boer

Web & Media Group, CS, VU University Amsterdam

Netherlands Institute for Sound and Vision

Web Science

Cultural Heritage and Digital History

ICT for Development

CAUTION! DIGITAL DIVIDE AHEAD

Internet users per continent

Source: International Telecommunication Union (ITU)

Linked Data for Development?

Can the (Semantic) Web (be made to) mean something for knowledge sharing even under very constraining conditions?

No internet, no computer, no electricity

Multitude of languages, levels of literacy

http://worldwidesemanticweb.org

Example: Landportal

http://landportal.info and http://landportal.sbc4d.com

Published by FAO

Example: Agrovoc/Agris

http://linked-development.org/

Example: Institute for Development Studies’ Eldis database

ELDIS Link to GeoNames

IDS: document 0002 Country:”Gambia”

Geonames:Gambia

Region: Africa

population : 1593256

N 13° 30' 0'' W 15° 30' 0''

Links to DBPedia

IDS: document 0001 Theme:”Food Security”

DBPedia:”Food Security”

Analysis of approaches to understanding and addressing food security issues; examination of the structural causes of food insecurity and different policy responses

Theme:” Food aid emergencies ”

Person:”David Pimentel”

Organisation:”FAO”

“Voedselzekerheid”@NL

IATI Sector:”Higher Education”

Organisation : UN Habitat

Activity: Multi donor fund to support civil society in democracy related issues

Aid received (USD)

Geonames:Gambia

Worldbank:Gambia

N 13° 30' 0'' W 15° 30' 0''

Example: Linked Data for International Aid Transparency Initiative (IATI)

Kasper Brandt

http://iati2lod.appspot.com/

4. How does violent conflict in recipient countries affect aid activities?

5. How does aid spending as registered in the IATI standard compare to World Bank indicators?

Hm..

Information sharing should be made

1. usable on small, affordable hardware deployed in various connectivity contexts;

2. accessible to individuals with varied cultural backgrounds / literacy levels;

3. relevant and directly useful to the target public they aim to empower.

Infrastructure, Interface, Relevancy

https://www.flickr.com/photos/worldbank/7826373720

Mobile phones

Community radio

Low-literacy

One of the grand challenges of ICT4D

Prevalent among the rural poor

Speech: The most natural user interface

http://www.holisticlc.com/speech-therapy/

M-agro Use Case Context: Regreening in Africa

Tominian region, Mali

Market Data store

GSM/Voice interface

Web Interface

Text-To-Speech / ASR

Radio users

Field operative

Users

Information / Communique Services

“Slot and Filler” Text-to-Speech

English:

Bambara:

15 liters of [honey] offered by Zakari Diarra

15_ba.wav L_ba.wav Of_ba.wav

Spoken Language Elements Repository

honey
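The slot-and-filler idea can be sketched roughly as follows: a message is a fixed template whose slots are filled with values, and every element is looked up as a pre-recorded audio fragment in the target language. The file naming follows the `15_ba.wav` pattern above; the template order, the `honey_ba.wav` file and the function name are assumptions made for illustration, not the deployed system.

```python
# Hypothetical repository of pre-recorded fragments, keyed by (element, language).
# File names follow the "<element>_<lang>.wav" pattern seen in "15_ba.wav".
REPOSITORY = {
    ("15", "ba"): "15_ba.wav",
    ("L", "ba"): "L_ba.wav",
    ("Of", "ba"): "Of_ba.wav",
    ("honey", "ba"): "honey_ba.wav",  # assumed file, not shown in the slides
}

# Assumed template: "{name}" marks a slot, anything else is a fixed fragment.
TEMPLATE = ["{quantity}", "{unit}", "Of", "{product}"]


def build_playlist(template, fillers, lang):
    """Return the ordered list of audio files that together speak the message."""
    playlist = []
    for part in template:
        if part.startswith("{") and part.endswith("}"):
            key = fillers[part[1:-1]]  # slot: take the filler value
        else:
            key = part                 # fixed fragment
        if (key, lang) not in REPOSITORY:
            raise KeyError(f"no recording for {key!r} in language {lang!r}")
        playlist.append(REPOSITORY[(key, lang)])
    return playlist


playlist = build_playlist(
    TEMPLATE, {"quantity": "15", "unit": "L", "product": "honey"}, "ba"
)
print(playlist)
```

Because only the fragments are recorded, a new language needs new recordings and a template in that language's word order, but no speech synthesis engine.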

Talking to LOD

Web applications

<VoiceXML> to SPARQL

Voice browser

RadioMarché Linked market data

‘Allo, Linked Data?

DBpedia, GeoNames

Agrovoc

rm:offering0001: linked to rm:shea_butter, rm:1000 and rm:kilo; rm:has_contact rm:Mazankuy_Diarra

rm:shea_butter: rdfs:label “Amande de Karité”@fr, “Shea Nuts”@en; speakle:voicelabel_ba rm:audio_shea_ba.wav; speakle:voicelabel_nl rm:audio_shea_nl.wav

rm:1000: rdfs:label “1000”; speakle:voicelabel_ba rm:audio_1000_ba.wav; speakle:voicelabel_nl rm:audio_1000_nl.wav

rm:kilo: rdfs:label “kilo”@en; speakle:voicelabel_ba rm:audio_kilo_ba.wav; speakle:voicelabel_nl rm:audio_kilo_nl.wav

Speakle voice labels
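As a minimal sketch of how a voice service might use such voice labels, the snippet below keeps a few triples as plain Python tuples and looks up the audio file for a resource in a given language. The pairing of each `speakle:voicelabel_*` predicate with an audio file is one reading of the diagram, and `voice_label` is a hypothetical helper; a real deployment would query an RDF store instead.

```python
# Triples transcribed from the slide; plain tuples stand in for an RDF store.
# Which audio file belongs to which predicate is an assumption based on file names.
TRIPLES = [
    ("rm:shea_butter", "rdfs:label", '"Shea Nuts"@en'),
    ("rm:shea_butter", "speakle:voicelabel_ba", "rm:audio_shea_ba.wav"),
    ("rm:shea_butter", "speakle:voicelabel_nl", "rm:audio_shea_nl.wav"),
    ("rm:1000", "speakle:voicelabel_ba", "rm:audio_1000_ba.wav"),
    ("rm:1000", "speakle:voicelabel_nl", "rm:audio_1000_nl.wav"),
    ("rm:kilo", "speakle:voicelabel_ba", "rm:audio_kilo_ba.wav"),
    ("rm:kilo", "speakle:voicelabel_nl", "rm:audio_kilo_nl.wav"),
]


def voice_label(resource, lang):
    """Return the audio resource for `resource` in `lang`, or None if absent."""
    predicate = f"speakle:voicelabel_{lang}"
    for subject, pred, obj in TRIPLES:
        if subject == resource and pred == predicate:
            return obj
    return None


# Audio fragments needed to speak "1000 kilo shea butter" in Bambara ("ba").
fragments = [voice_label(r, "ba") for r in ("rm:1000", "rm:kilo", "rm:shea_butter")]
print(fragments)
```

Storing the audio reference next to the textual `rdfs:label` means the same data can drive both a web interface and a voice interface.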

RadioMarché leads to new questions, new actions, new knowledge

Honey sales doubled in Tominian since RadioMarché; shea butter sales lagged behind, despite the same approach in RadioMarché

Source: Sahel Eco, Amadou Tangara, internal evaluation report EVP – Radio Marché

Infrastructure

Interface Relevancy

Weather Service (Ghana, Mali, Burkina Faso)

DigiVet (N-Ghana)

Radio message service (Mali)

Poultry vaccination service (Mali)

Milk service (Mali)

Seed market (Mali and Burkina Faso)

Veterinary service (N-Ghana)

Knowledge base and diagnosis system

CommonKADS

Information sharing across locations

DigiVet

Mr. Meteo

Mali Milk

Linked Data’s flexible data model fits the “downscaling” aspect of CS4D well

Data is locally relevant…

…but might be shared across regions…

…across locally co-developed applications…

…and aggregated to country/global level
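A small sketch of why a triple-based model suits this kind of sharing: merging two local data stores is a set union, and aggregation works even when one deployment uses properties the other does not. All identifiers and data below are made up for illustration, and blank nodes are ignored for simplicity.

```python
from collections import Counter

# Made-up triples for two hypothetical deployments; all identifiers are illustrative.
mali_store = {
    ("ex:offer1", "ex:product", "ex:honey"),
    ("ex:offer1", "ex:region", "ex:Tominian"),
    ("ex:offer2", "ex:product", "ex:shea_butter"),
}
burkina_store = {
    ("ex:offer3", "ex:product", "ex:honey"),
    # An extra property used by only one deployment; no schema change is needed.
    ("ex:offer3", "ex:country", "ex:Burkina_Faso"),
}

# Merging the two stores is a plain set union of their triples.
merged = mali_store | burkina_store

# Aggregate across deployments: number of offerings per product.
offers_per_product = Counter(obj for _, pred, obj in merged if pred == "ex:product")
print(offers_per_product["ex:honey"])
```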

Interface Relevancy

Infrastructure

http://www.nytimes.com/2008/12/16/health/16incubators.html?pagewanted=1&_r=3&partner=permalink&exprod=permalink

Appropriate, sustainable hardware

affordable,

low-power-consuming,

repairable,

durable,

ownable,

Total Cost of Ownership (TCO)

Example: Efficient Knowledge sharing with SemanticXO and ERS

Christophe Guéret

ENTITY REGISTRY SYSTEM (ERS)

• Fully decentralised Linked Data publication platform
• Works under any kind of connectivity context
• Tracks individual edits back to their authors
• Simple and versatile
• Open Source https://github.com/ers-devs
• Low resource demands

... and open for contributions, so don't hesitate to fork it!

Voice service: Multiple solutions

SIP over Ethernet

HTTP

Office router running Asterisk

Orange Emerginov Platform

GSM dongle

DOWNSCALING

Cheap, small hardware

Foroba Blon 2.0

Kasadaka

Low-resource knowledge sharing platform

Low-power, ubiquitous, cheap hardware; FLOSS components

Rapid prototyping + deployment of (knowledge-intensive) services

User interfaces: voice services, SMS-based, visual

Wifi, 3g or GSM network

RDF Data store (Linked Data) allows for flexible data integration across applications, deployments

Kasadaka (“talking box”)

Andre Baart

Machine to machine information exchange

GSM network

SPARQL in an SMS

Enable (Semantic) Web data exchange over GSM networks.

Practical differences between HTTP and SMS:

SMS works with phone numbers, HTTP works with URLs.

SMS has a size restriction, HTTP practically has none.

SMS is one-way messaging, HTTP follows request-response.

Basic M2M communication based on SPARQL.

Onno Valkering

SPARQL in an SMS

Converters to translate a SPARQL HTTP request into SMS messages (140 or 160 chars) and vice versa

CONSTRUCT, INSERT/DELETE DATA
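To make the size restriction concrete, here is a rough sketch of how a converter could split a SPARQL request into 140-byte SMS parts and put them back together on the receiving side. The three-byte header (message id, part index, part count) is an assumption made for this example and is not taken from the actual converters.

```python
import math

SMS_BYTES = 140    # payload size of one binary SMS, as mentioned above
HEADER_BYTES = 3   # assumed header: message id, part index, part count


def split_into_sms(payload: bytes, msg_id: int) -> list[bytes]:
    """Split a payload into SMS-sized parts, each prefixed with a small header."""
    chunk = SMS_BYTES - HEADER_BYTES
    total = max(1, math.ceil(len(payload) / chunk))
    if total > 255 or not 0 <= msg_id <= 255:
        raise ValueError("payload or message id does not fit a one-byte header field")
    return [
        bytes([msg_id, index, total]) + payload[index * chunk:(index + 1) * chunk]
        for index in range(total)
    ]


def reassemble(parts: list[bytes]) -> bytes:
    """Rebuild the payload; parts may arrive in any order."""
    ordered = sorted(parts, key=lambda part: part[1])
    total = ordered[0][2]
    if [part[1] for part in ordered] != list(range(total)):
        raise ValueError("missing or duplicate SMS parts")
    return b"".join(part[HEADER_BYTES:] for part in ordered)


query = ("CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10" * 5).encode("utf-8")
parts = split_into_sms(query, msg_id=7)
restored = reassemble(list(reversed(parts)))  # simulate out-of-order delivery
print(len(parts), restored == query)
```

The part index and count let the receiver cope with out-of-order arrival, which is one aspect of blending asynchronous SMS delivery with request-response semantics.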

Challenges

Blending synchronous and asynchronous messaging

SPARQL/ RDF compression

Unpredictable query result sizes

Compression experiments for small datasets

Strategies

Different serializations: RDF/XML, N-triples, Turtle, HDT [1], EXI [2]

Compression (zip)

Assume shared vocabularies (top 20 from prefix.cc) and remove redundant (inferenced) triples (RDFS reasoning)
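The following sketch illustrates two of these strategies on toy data: replacing well-known namespace IRIs with prefixes that sender and receiver are assumed to share, and then compressing with gzip. The prefix table, the simplified `<prefix:name>` encoding and the sample triples are assumptions for illustration; this is not a real Turtle serializer and it does not cover removal of inferred triples.

```python
import gzip
import math

# Assumed shared prefix table; the experiments use the top 20 from prefix.cc.
SHARED_PREFIXES = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#": "rdf:",
    "http://www.w3.org/2000/01/rdf-schema#": "rdfs:",
    "http://xmlns.com/foaf/0.1/": "foaf:",
}


def shorten(ntriples: str) -> str:
    """Replace shared namespace IRIs with their prefix (toy encoding)."""
    for iri, prefix in SHARED_PREFIXES.items():
        ntriples = ntriples.replace("<" + iri, "<" + prefix)
    return ntriples


def expand(text: str) -> str:
    """Inverse of shorten(), applied by the receiver."""
    for iri, prefix in SHARED_PREFIXES.items():
        text = text.replace("<" + prefix, "<" + iri)
    return text


def sms_count(payload: bytes, sms_bytes: int = 140) -> int:
    """Number of SMS messages needed for a payload, ignoring any headers."""
    return max(1, math.ceil(len(payload) / sms_bytes))


# Made-up sample data in N-Triples syntax.
SAMPLE = "\n".join(
    f"<http://example.org/offering{i}> "
    f'<http://www.w3.org/2000/01/rdf-schema#label> "offering {i}" .'
    for i in range(20)
)

plain = SAMPLE.encode("utf-8")
packed = gzip.compress(shorten(SAMPLE).encode("utf-8"))
print(len(plain), len(packed), sms_count(plain), sms_count(packed))
```

On the receiving side, `expand(gzip.decompress(packed).decode("utf-8"))` restores the original N-Triples text.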

Evaluated on real-world datasets from LOD Laundromat [3]

232,822 small datasets (1-1,000 triples)

[1] Fernández et al. “Compact Representation of Large RDF Data Sets for Publishing and Exchange” (ISWC 2010)

[2] Käbisch et al. “Standardized and Efficient RDF Encoding for Constrained Embedded Networks” (ESWC 2015)

[3] http://lodlaundromat.org/ and Rietveld et al. “LOD Lab: Experiments at LOD Scale” (ISWC 2015)

Compression (percentage w.r.t. N-triples) per dataset size bin

Triples   N-triples+gzip  RDF/XML  RDF/XML+gzip  Turtle  Turtle+gzip  HDT    HDT+gzip  EXI   EXI+gzip  Best+vocabulary-based
1-10      50.7            103.8    77            102     70.3         495.5  180.1     57.5  65.9      44.2
11-20     22.5            62       27.1          50.5    24.2         122.2  47        23.3  24.9      18.9
21-30     16.2            58.2     18.5          48.7    16.3         79.5   31.1      16.5  17.5      13.6
31-40     28.3            69.1     30.9          62.1    28.6         86.5   40.7      28.2  29.1      23.5
41-50     9.8             51.2     10.2          42.3    8.6          38.1   14.8      9.3   9.7       8
51-60     17.2            59.2     17.5          50.1    15.9         50.5   22.8      15.8  16.3      8.7
61-70     11.8            58.5     12.4          42.4    10           43     17.7      11.1  11.6      6
71-80     8.8             54.8     8.5           40.9    7            31.6   11.2      7.5   7.8       6.4
81-90     6.7             52       6.3           40.6    5.1          25.4   9.1       5.8   6         4.4
91-100    8.1             54.9     7.6           40.4    6.2          26.9   9.7       6.8   7         5.7
101-200   8.8             62       8.3           39.2    6.7          24.7   10.1      7.6   7.9       5.7
201-300   4.8             50.8     3.6           39      2.8          13.4   4         3.6   3.6       2.7
301-400   4.8             51.5     3.3           37.7    2.5          11.4   3.3       3     3.1       2.5
401-500   4.4             51.5     2.9           37.4    2.2          10.4   2.7       2.6   2.7       2.2
501-600   5               53.8     3.4           38.7    2.5          8.9    3         2.9   3         2.4
601-700   4.1             51       2.5           35.9    1.7          8.5    2.2       2.3   2.4       1.7
701-800   4.5             51.1     2.7           36.2    1.9          8.1    2.1       2.4   2.4       1.9
801-900   4.4             51.1     2.6           36.4    1.8          7.9    1.9       2.3   2.3       1.8
901-1000  4.1             50.9     2.4           36.5    1.7          7.7    1.7       2.1   2.1       1.7

For very small datasets (<40 triples), N-triples + gzip works best

For larger datasets, Turtle + gzip compresses best

Removing redundancies using shared vocabularies adds additional compression

Compression experiments results

Number of SMSes  Avg. number of triples
1                0
2                3
3                8
4                16
5                24
6                84
7                98
8                126
9                189
10               301


[Chart: compression (percentage w.r.t. N-triples, axis 0 to 80) against size of dataset in triples (binned), for N-triples+gzip, Turtle+gzip, and Best + vocabulary-based]

Evaluation: 4 scenarios in 2 cases

Digivet and RadioMarche applications

Four scenarios / SPARQL queries in total

Results

Linked Data over Sneakernet / Driveby wifi

Master project Fahad Ali

Semantic Web over SMS is feasible using data compression + semantic background knowledge

economically feasible for ICT4D services for small datasets

Semantic Web without the Web is possible, cf. IoT

Radio Data System (RDS), LoRa,..

Developing countries can leapfrog directly into the information age, jumping many phases of immature technologies

Img: flickr/n3v3rv0id

From ICT4D to Development Informatics


KR4D, LD4D, HCI4D, …

Test Computer Science hypotheses in domains/environments

Thank you

@victordeboer

v.de.boer@vu.nl

http://w4ra.org

http://worldwidesemanticweb.org

One more example

Aid accountability in Nepal

Marije Visscher, Social Scientist

Aske Robenhagen, Computer Scientist

Citizens

Community frontline associates (CFAs)

Accountability Lab Office

Donors, Journalists

Surveys, focus groups. Collected in forms.

Reports from the field, solutions.

Aggregated reports, accountability

Example data


Smartphone - Voice System - SMS

Report1

हसकल्पनाम्यामआन्जिननयरहरुद्वारागाब

geo:Kavra

Report2

Knowledge graph