Mobile Development Meets Semantic Technology
© Blue Slate Solutions 2012
Mobile Development Meets
Semantic Technology
David S. Read
CTO, Blue Slate Solutions
Semantic Technology and Business Conference
June, 2012
Introductions
• David Read, CISSP, GSEC, SCJP
– Blue Slate CTO and Chief Solution Architect
– 25+ years tech leadership experience
– Strategy and technology focus
– Passionate about semantic technology
and data mining
• Attendees
– Business?
– Technology?
– Production Use of Semantic
Technologies?
– Mobile Part of Business Vision?
Source: http://crystalwashington.com/wp-content/uploads/2011/07/boringconference.jpg
What is mobile?
Agenda
• Premise and Goal
• Mobile constraints
– not focused on the UI
• Semantic technology components
• Leveraging semantic flexibility
Premise
Multi-tier Architectures – Flexible at Design Time
[Diagram: four-tier architecture]
• Data Tier (DB, ESB, Unstructured)
– Persistence
– Transactions
– Categorization
– Indexing
– Security
• Integration Tier
– Data abstraction
– DAO
– Transactions
– Security
– e.g. Hibernate, XSL, PL/SQL
• Business Tier
– Aggregation
– Business Rules
– SDO
– Work Flow
– Transactions
– Security
• Presentation Tier (App1 to App5: .Net, iOS, Android, JSP)
– View
– Layout
– Security
Other labels in the diagram: Object Domain; Native/Proprietary Domain; O-R, O-XML, O-Unstructured; WF, RE, WS, Data
Goal
Use semantic technology to mitigate mobile platform constraints at runtime
MOBILE CONSTRAINTS
Application Memory
• Limits on runtime application and heap
– Android devices
• 16, 24 or 32 MB per application
• Manifest: android:largeHeap="true" (ouch!)
– iOS devices
• Kill the application between 16 and 40 MB
– BlackBerry devices
• (COD files) 8 MB application and 8 MB for resources
– Windows CE/Mobile
• Older (v5 and v6) limited to 32 MB; v7 peaks at ~90 MB
CPU
Source: http://www.passmark.com/forum/showthread.php?t=3381
Bandwidth
Sources: http://www.diffen.com/difference/3G_vs_4G
and http://www.hostile.org/coredump/bandwidth.html
Connectivity
• Verizon, AT&T and Sprint Coverage
• Cellular coverage has gaps
• 4G is far from the norm
Sources: http://www.verizonwireless.com/b2c/CoverageLocatorController,
http://www.wireless.att.com/coverageviewer/#,
http://coverage.sprintpcs.com/IMPACT.jsp?covType=sprint&id9=vanity:coverage
Battery Life
• Talk and standby time are common
(but not meaningful) measures
• Surfing the web, playing games, checking email, all
require significant power: radio, CPU and screen
Source: http://www2012.wwwconference.org/proceedings/proceedings/p41.pdf
SEMANTIC TECHNOLOGY
COMPONENTS
Data
• Facts from remote and local sources
• May be structured or unstructured
• More value as it is federated
• Standard semantic representation - RDF
[Diagram: an RDF triple: Subject, Predicate, Object]
Ontology
• Classification and rules (inferencing)
• Transform at any tier
– Extrapolate redundant information
• Standard representation - RDFS and OWL
Source: http://data.gov.uk/resources/payments
Reasoner
• Software which applies the ontology to data
• Can assert new data
• Well understood technology
– Available on a broad range of platforms, including mobile
Query Processing
• CRUD operations on data
• Standard semantic query language - SPARQL
[Diagram: a SPARQL processor (local or remote) evaluates a query with the variables ?theRelation and ?theObject against a graph of nodes (ObjectA to ObjectE, Constant) linked by Relation1 to Relation4. Result bindings:]
?theRelation    ?theObject
Relation3       ObjectD
Relation4       ObjectE
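To make the result bindings above concrete, here is a minimal sketch of what a query of the form "fixed subject, ?theRelation, ?theObject" does. It is plain Java rather than a real SPARQL processor. The class and method names are hypothetical, and the choice of ObjectC as the subject in the usage note is an assumption, since the diagram does not survive in this transcript.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: match the pattern (subject, ?theRelation, ?theObject) over in-memory triples. */
final class TriplePatternDemo {

    /**
     * Each triple is a String[3] of {subject, predicate, object}.
     * Returns one {predicate, object} pair per triple whose subject matches.
     */
    static List<String[]> relationsOf(String subject, List<String[]> triples) {
        List<String[]> bindings = new ArrayList<>();
        for (String[] triple : triples) {
            if (triple[0].equals(subject)) {
                bindings.add(new String[] { triple[1], triple[2] });
            }
        }
        return bindings;
    }
}
```

Calling relationsOf("ObjectC", triples) on a list that contains (ObjectC, Relation3, ObjectD) and (ObjectC, Relation4, ObjectE) returns those two predicate/object pairs, in list order.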
LEVERAGING SEMANTIC
TECHNOLOGY, MOBILE STYLE
Levers to Minimize Power Consumption
• Remote Data Access
• Local Computation
Use Efficient Representations
• Bandwidth is variable
• Radio communication costs battery significantly
• Little entropy per bit in most non-binary formats
– XML-based information often contains less data than
markup (far less than 1 bit entropy per byte)
Format            Size (KB)    %
RDF/XML           693          100
Turtle            258          37
Zipped RDF/XML    33           5
Zipped Turtle     27           4
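The "zipped" rows above can be approximated on the device with the JDK's gzip streams. This is a minimal sketch, assuming the payload is already available as a Turtle string. PayloadCompressor is a hypothetical helper name, and the slides do not say which compression tool produced the figures in the table.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzip a serialized RDF payload before it crosses the radio. */
final class PayloadCompressor {

    /** Compresses the UTF-8 bytes of the given text with gzip. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses gzip(), returning the original text. */
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The compression ratio depends on how repetitive the payload is, so measure it on representative data rather than assuming the ratios in the table.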
Reasoning Locally to Minimize Power Consumption
• CPU on phone is often underutilized
• Application behavior may have user-specific
configurations controlling data relationships
• Reasoning at the server produces a result equivalent to reasoning at the client
– Same syntax
[Diagram: Network Access versus Process Execution?]
Reasoning Locally – An example
• Application to report on fuel consumption statistics
• Small ontology sets up relationships between
vehicles, fuel purchases and gas stations
• Data for all fuel purchases
• Classify cars and gas stations, infer additional
information (e.g. distance, MPG)
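The slides do not show how distance and MPG are derived. The sketch below assumes the usual arithmetic: distance is the difference between consecutive odometer readings, and MPG is that distance divided by the gallons purchased. FuelMath and its method names are hypothetical, and the numbers in the usage note are made up for illustration.

```java
/** Hypothetical sketch of the arithmetic behind the inferred distance and MPG values. */
final class FuelMath {

    /** Miles driven between two fill-ups, from their odometer readings. */
    static double distanceMiles(double previousOdometer, double currentOdometer) {
        if (currentOdometer < previousOdometer) {
            throw new IllegalArgumentException("odometer readings must not decrease");
        }
        return currentOdometer - previousOdometer;
    }

    /** Fuel economy for the distance covered on the gallons just purchased. */
    static double milesPerGallon(double distanceMiles, double gallons) {
        if (gallons <= 0) {
            throw new IllegalArgumentException("gallons must be positive");
        }
        return distanceMiles / gallons;
    }
}
```

For example, odometer readings of 10,000 and 10,300 miles with a 7.5 gallon purchase give a distance of 300 miles and 40 MPG.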
Simple Ontology and Data
Initial Triple Count and Combined Inferred Assertions
What Just Happened?
[Diagram: numbered data flow, steps 1 to 8. On the mobile device: Local Data and Local Ontology feed the Reasoner, which produces Augmented Local Data; a Local Query runs in the Local Query Engine. Remote (cloud): DBpedia Endpoint and DBpedia Data. Output: Federated Results.]
How Does That Work on a Mobile Device?
• Semantic Reasoner and Query Libraries
– Several available (commercial and open source)
– Using Java (Android)-based Open Source libraries for
this demonstration
• Principles are the same for any semantic library
– Load an ontology and instances into working storage
• Might exist locally or be loaded via network
– Run the reasoner to create a model
– Query the resulting model to obtain result sets
• Inferred data can be persisted to create a new local
data set
A Mobile Reasoner and Query Processor
• Jena
– Java-based semantic library
• Started by HP Labs as open source project in 2000
• Apache project (incubator) since 2010
– Interfaces to reasoner, triple store
– Built-in reasoner (RDFS, OWL DL)
– http://incubator.apache.org/jena/
– ARQ
• Jena’s SPARQL processor
• Android ports
– Androjena and ARQoid
– http://code.google.com/p/androjena/
USING REASONING FOR
TUNING
Reasoner Has No Expectations of an Ontology
• The process and libraries we discussed do not
require an ontology to be provided in order to
create a model
• If only data is provided then the reasoner creates a
model containing the data
• Allows us to tune behavior from the server side
without separate code bases
– Dynamically balance network bandwidth usage versus
CPU
– Server load, bandwidth limitations, mobile device
limitations, …
Simple Example of Real-time Tuning
In our fuel consumption application…
[Chart: Full Data compared with Partial Data and Ontology, by component: Network, Local CPU, Server CPU]
Fuel Data and Ontology Overview
• Using raw fuel purchases as our representative
data set
– SPARQL endpoint:
http://semantic.monead.com/vehicleinfo/mileage
– Fully inferred data set: http://monead.com/semantic/data/HybridMileageOntologyAll.Inferenced.xml
• Sizing
Information         RDF/XML Size (KB)    Turtle Size (KB)    %
Ontology            6                    2                   1
Minimal Data        200                  68                  27
Fully Inferenced    692                  257                 100
Fuel Data and Ontology Details
veh:fuelPurchase0381 a veh:FuelPurchase; veh:vehicle pveh:car2; veh:date "2011-08-14";
veh:gallons 6.908; veh:usDollarsPerGallon 3.779;
veh:totalUsDollarsCharged 26.11;
veh:reportedMpg 45.4; veh:odometerMiles 106883.;
veh:purchaseStation veh:Hess1.
veh:Stoughton-MA a veh:Place;
owl:sameAs
<http://dbpedia.org/resource/Stoughton,_Massachusetts>;
veh:placeName "Stoughton, MA".
veh:Sunoco19 a veh:GasStation;
veh:stationName "Sunoco";
veh:brandInfo
<http://dbpedia.org/resource/Sunoco>;
veh:location veh:Glenville.
Use Case R-1: Slow Network
[Diagram: Mobile Device connected to Server at 1.5 MBPS]
Component      Full Data Set (Sec)    Partial Data Set (Sec)    Difference (Sec)
Data           171                    44                        127
Ontology       0                      2                         -2
Radio Total    171                    46                        125
Rendering      0                      15                        -15
Total          171                    61                        110
Use Case R-2: (Faster Network)
[Diagram: Mobile Device connected to Server at 15 MBPS]
Component      Full Data Set (Sec)    Partial Data Set (Sec)    Difference (Sec)
Data           17                     4                         13
Ontology       0                      1                         -1
Radio Total    17                     5                         12
Rendering      0                      15                        -15
Total          17                     20                        -3
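The two use cases reduce to a simple comparison: send the full data set unless the partial data set plus local processing finishes sooner. The sketch below is one possible way to express that decision. DeliveryPlanner and its parameters are hypothetical names, and the timings would have to come from measurements such as those in the tables above. The localProcessingSec parameter corresponds to the "Rendering" row.

```java
/** Hypothetical sketch: pick the delivery mode with the lower estimated end-to-end time. */
final class DeliveryPlanner {

    enum Mode { FULL_DATA, PARTIAL_DATA_AND_ONTOLOGY }

    /**
     * @param fullRadioSec       estimated transfer time for the fully inferred data set
     * @param partialRadioSec    estimated transfer time for the minimal data plus ontology
     * @param localProcessingSec estimated time for the device to process the partial data
     */
    static Mode choose(int fullRadioSec, int partialRadioSec, int localProcessingSec) {
        int partialTotalSec = partialRadioSec + localProcessingSec;
        // Ties go to the full data set, which spares the device's CPU.
        return partialTotalSec < fullRadioSec
                ? Mode.PARTIAL_DATA_AND_ONTOLOGY
                : Mode.FULL_DATA;
    }
}
```

With the figures from Use Case R-1, choose(171, 46, 15) selects the partial data set and ontology. With the figures from Use Case R-2, choose(17, 5, 15) selects the full data set.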
Create a Reasoner Instance
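// Package names as used by Jena 2.x and Androjena
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;

// Wrap an empty model with Jena's built-in OWL reasoner
Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
Model infModel = ModelFactory.createInfModel(
    reasoner, ModelFactory.createDefaultModel());
// Expose the inferencing model through the ontology API
OntModel model = ModelFactory.createOntologyModel(
    OntModelSpec.OWL_DL_MEM, infModel);
model.setStrictMode(false);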
Process the Ontology (Run the Reasoner)
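import java.io.Reader;
import java.io.StringReader;

// The ontology (and/or data) as Turtle text, held locally or fetched over the network
Reader ontologyReader = new StringReader("Your Ontology");
// Load the statements into the inferencing model created above
model.read(ontologyReader, null, "TURTLE");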
Client Behavior is Unchanged for Either Use Case
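[Diagram: steps 1 to 5 between Server and Client. The Client sends a Request; the Server chooses (?) between the Full Data Set and the Partial Data Set and Ontology; on the Client, the Data passes through the Reasoner and the Query Engine to produce Results.]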
USING QUERYING FOR
TUNING
SPARQL Results Not Concerned with Data Source
• A SPARQL query can access a local model, remote
data and/or remote SPARQL endpoint
• The results are processed in the same manner
– Another tuning opportunity
• Dynamically shift between server and client
– Network/Radio and CPU (federation)
– CPU (sorting, filtering)
Query Execution Behavior – 2 Approaches
• Server can send queries to client device
• Client decides which query to use based on
metadata from server or its own measurements
Query Execution Behavior – Server Decides
[Diagram: steps 1 to 4 between Server and Client. The Client sends a Request; the Server selects (?) from its Queries and returns a Query and Data; the Client's Query Engine produces Results.]
Query Execution Behavior – Client Decides
[Diagram: steps 1 to 4 between Server and Client. The Server supplies Data; the Client selects (?) from its own Queries; the Client's Query Engine produces Results.]
Use Case Q-1 (Local Model)
• Query executes against local model
• If the query requires a lot of CPU (sorting, filtering), it could be better to move that work to the server
[Diagram: on the Client, Queries and Data feed the Query Engine, which produces Results]
Setup a SPARQL Query Against Local Model
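import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;

String query = "Your SPARQL Query";
// Bind the query to the local (inferred) model
QueryExecution qe = QueryExecutionFactory.create(query, model);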
Use Case Q-2 (Single SPARQL Endpoint)
• Query executes against one SPARQL endpoint
• Have client device execute this directly
• One consideration: significant latency
[Diagram: the Client's Query Engine sends Queries to a SPARQL Endpoint on the Server, where a Query Engine evaluates them against the Data; Results return to the Client]
Use Case Q-3 (Federated SPARQL Endpoints)
• Query executes against several SPARQL
endpoints
• Requires communications with all the endpoints
and integration of the results
Use Case Q-3 (Federated SPARQL Endpoints) - Alt
• If the federated query is significantly complex, it makes sense to proxy it through a server
Setup a SPARQL Query Against Remote Endpoint
QueryExecution qe;
String query = "Your SPARQL Query";
String queryUri = "Some SPARQL Endpoint URI";
String queryDefaultGraphUri = "An Optional Graph Uri";
if (queryDefaultGraphUri.length() > 0) {
    // Run the query against the named default graph at the endpoint
    qe = QueryExecutionFactory.sparqlService(
        queryUri, query, queryDefaultGraphUri);
} else {
    qe = QueryExecutionFactory.sparqlService(queryUri, query);
}
Use Case Q-4 (Raw Data Sources)
• Query executes against semantic data sources
• Server aggregates results since the entire graph
typically needs to be brought across the network
[Diagram: the Client sends Queries to a Server that hosts a Query Engine and SPARQL Endpoint; that Server pulls Data from a separate data Server and returns Results to the Client]
Setup a SPARQL Query Against Remote Graph
/* e.g. query contains FROM or SERVICE */
String query = "Your SPARQL Query";
QueryExecution qe = QueryExecutionFactory.create(query);
Query Processing Behavior is Unchanged
for Any Use Case
• The query execution syntax differs for local versus
remote data sources
• The result set, however, is processed in the same
manner for any select query
• Results in a small amount of code to execute the
correct query form (the three preceding code
snippets) but all downstream code is consistent
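The choice among the query forms can be kept in one small routing function, so that the rest of the client never needs to know where the query ran. The sketch below loosely maps use cases Q-1, Q-2 and the proxied variant of Q-3 onto a single decision. QueryRouter, its parameters and the rule itself are assumptions for illustration, not the presenter's implementation.

```java
/** Hypothetical sketch: decide where a query should run before building its QueryExecution. */
final class QueryRouter {

    enum Target { LOCAL_MODEL, REMOTE_ENDPOINT, SERVER_PROXY }

    /**
     * @param dataAvailableLocally true when the local model already holds the needed data (Q-1)
     * @param endpointCount        number of remote SPARQL endpoints the query must reach
     */
    static Target route(boolean dataAvailableLocally, int endpointCount) {
        if (dataAvailableLocally) {
            return Target.LOCAL_MODEL;      // Q-1: no radio use at all
        }
        if (endpointCount <= 1) {
            return Target.REMOTE_ENDPOINT;  // Q-2: the client calls the endpoint directly
        }
        return Target.SERVER_PROXY;         // Q-3 (alt): let a server federate the endpoints
    }
}
```

A real client would probably also weigh measured latency and battery state, as the "Client Decides" slide suggests.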
Retrieve the SPARQL Results
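import java.util.List;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

ResultSet results = qe.execSelect();
List<String> columnNames = results.getResultVars();
while (results.hasNext()) {
    QuerySolution solution = results.next();
    for (String var : columnNames) {
        String value = null; // stays null when the variable is unbound
        if (solution.get(var) != null) {
            if (solution.get(var).isLiteral()) {
                value = solution.getLiteral(var).toString();
            } else {
                // getURI() returns null for blank nodes
                value = solution.getResource(var).getURI();
            }
        }
        // use value here, e.g. add it to the row being displayed
    }
}
qe.close();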
Caching
• Caching is a good option with mobile devices
– Cache the data received, assertions reasoned and
results obtained
• Typically have access to “private” storage space,
often located on removable storage (SD card)
• Doesn’t interfere with basic phone data (local,
phone) storage
• Not as limited as native storage
– For BlackBerry this is the only reasonable way to break
out of the 8MB application data limitation
• A mature set of caching libraries does most of the
interesting work for you
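As a minimal illustration of the idea, the sketch below keeps the most recently used query results in memory using only the JDK. QueryResultCache is a hypothetical name. Writing entries to the device's private storage, which the bullets above recommend, is not shown here, and a caching library would normally handle it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory LRU cache for query results, keyed by query text. */
final class QueryResultCache<V> extends LinkedHashMap<String, V> {

    private final int maxEntries;

    QueryResultCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; evicts the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of two, putting q1 and q2, reading q1 and then putting q3 evicts q2, because q1 was touched more recently.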
Summary
Semantic technology, by virtue of inferencing,
platform independence, consistent syntax and
standard protocols, enables dynamic intra-tier
tuning without significant coding and
configuration.
Thank You
• I appreciate your taking the time
to attend this session
• Contact and Business
– www.blueslate.net
• Reference Information
– Semantic technology thoughts and work
• http://monead.com/semantic
– Sparql Droid
• https://play.google.com/store/apps/details?id=com.monead.semantic.android.sparql&hl=en