DataBearings: A semantic platform for data integration on IoT, Artem Katasonov
DataBearings: A Semantic Platform for Data Integration on IoT
Artem Katasonov (VTT Technical Research Center of Finland)
26/09/2014
Business Needs
• Need: Companies have an increasing number of their own databases and various other in-house / external (business partners, Open Data) data sources.
• Need: Companies want to exploit ever-growing and diverse data efficiently and dynamically for new and better services.
• Need: In the market, there is a great need for novel applications and a better capability to provide novel services to customers in order to differentiate and compete.
• Need: Companies are looking for data management solutions that reduce development and maintenance / extension costs.
Background: General Approaches to Data Integration
4 approaches:
• Integrated packages (e.g. SAP)
• Messaging (ESB, i.e. WS-* based, etc.)
• Data warehouses (Extract-Transform-Load approach)
• Enterprise Information Integration (EII) – integration without first loading into a warehouse, i.e. “on the fly”
Points for EII:
• Access to “live” data
  • Internet of Things (sensors, RFID) makes a good case for it.
• Reduces costs by leveraging existing data sources in new ways, avoiding data replication with its hardware, software and human costs.
• Enables integration with external sources (warehouses cannot help here).
• Allows fast and iterative "trying out" of new data sources, new processing pipelines, or new distribution channels (when not 100% sure that it is beneficial for business).
Drawbacks of Non-semantic EII
Commercial EII tools:
• Heavy and expensive.
• Some work only with databases, not Web services, etc.
• Relational approach:
  • Need to manually define a schema that integrates the schemas of the underlying data sources.
  • Such a federation view is harder to modify later.
• Do not include data processing functionality (only federate data, post-processing has to be done elsewhere).
• Do not support data updates (leaving that to EAI tools).
Semantic EII in DataBearings: How it Works
[Diagram: query-processing pipeline. Stages shown: query analysis, query reformulation (to a low-level query), query decomposition, query multiplication, sub-queries translation, results filtering, join / union, custom data post-processing (incl. formatting). A single high-level query is split into sub-queries S-Q1, S-Q2 and S-Q3, issued as SQL to a database, as SOAP / REST to a Web service, and as GET / local IO to a Web server or file. Result 1, Result 2 and Result 3 are combined into a single answer.]
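The pipeline on this slide can be illustrated with a small Python sketch. It is not DataBearings code: the in-memory SQLite table, the simulated Web service response, the function names and the parking lot "Keskusta" are all assumptions made for illustration. It only shows the general idea of translating one high-level question into per-source sub-queries, joining the results and filtering them.

```python
import json
import sqlite3

# Hypothetical static source: an in-memory SQL database.
def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE parking (name TEXT, capacity INTEGER)")
    db.executemany("INSERT INTO parking VALUES (?, ?)",
                   [("Hamppi", 960), ("Keskusta", 300)])
    return db

# Hypothetical dynamic source: a JSON document standing in for a Web service reply.
SERVICE_RESPONSE = json.dumps([
    {"name": "Hamppi", "occupancy": 200},
    {"name": "Keskusta", "occupancy": 300},
])

def sub_query_sql(db):
    # Sub-query translated to SQL.
    rows = db.execute("SELECT name, capacity FROM parking")
    return {name: {"capacity": capacity} for name, capacity in rows}

def sub_query_service():
    # Sub-query translated to a (simulated) Web service call.
    return {r["name"]: {"occupancy": r["occupancy"]}
            for r in json.loads(SERVICE_RESPONSE)}

def answer(db, min_free):
    """High-level query: car parks with at least min_free free spaces."""
    static, dynamic = sub_query_sql(db), sub_query_service()
    # Join the partial results on the parking name.
    joined = [{"name": n, **static[n], **dynamic[n]}
              for n in static if n in dynamic]
    # Results filtering.
    return [r for r in joined if r["capacity"] - r["occupancy"] >= min_free]

result = answer(make_db(), min_free=1)
```

Here the caller never sees which source supplied which field, which is the point of integrating "on the fly".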
A DataBearings-based solution supplies data to the “Street Parking Enforcement” mobile application:
• Integrates data from various payment providers
  • ‘Pay and display’ machines.
  • Mobile payment services (EasyPark, Parkman, etc.).
A DataBearings-based solution supplies data to CarP:
• Integrates static (manually-managed) data and dynamic data (from sensors).
• Integrates data from different Finnish cities (different systems in use for static and dynamic data).
• Delivers data in Datex II format
Data Integration for Datex II, CarP and related
[Diagram: three layers (Access, Integration, Forwarding). Outputs: DATEX II publication and the Parking Guidance mobile app. Each source is connected by push, timed pull or query-time pull.]
Data sources:
• Jyväskylä static data (MS Excel document)
• Tampere static data (NettiParkki, SOAP Web service)
• Tampere extra static data (PlatformX, JSON Web service)
• Jyväskylä dynamic data (Designa, proprietary interface)
• Pirkkala dynamic data (Designa, proprietary interface)
• Tampere dynamic data (PlatformX, JSON Web service)
• Jyväskylä VMS (Designa, proprietary interface)
• Pirkkala VMS (FLS Rosign, proprietary Web service)
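The three access modes named in the diagram (push, timed pull, query-time pull) can be contrasted with a minimal Python sketch. The `Source` class and its parameters are hypothetical and not part of DataBearings; in particular, the timed pull is refreshed lazily on read here, whereas a real deployment would more likely use a background timer.

```python
import time

class Source:
    """One data source plus its access mode (all names are illustrative)."""

    def __init__(self, fetch=None, mode="query-time pull", interval=60.0):
        self.fetch = fetch          # callable that asks the source for data
        self.mode = mode            # "push", "timed pull" or "query-time pull"
        self.interval = interval    # seconds between timed pulls
        self.cache = None
        self.fetched_at = None

    def push(self, data):
        # Push: the source delivers data on its own initiative.
        self.cache = data

    def read(self, now=None):
        now = time.monotonic() if now is None else now
        if self.mode == "query-time pull":
            # Always ask the source when a query arrives.
            return self.fetch()
        if self.mode == "timed pull":
            # Re-fetch only when the cached copy is older than the interval.
            if self.fetched_at is None or now - self.fetched_at >= self.interval:
                self.cache, self.fetched_at = self.fetch(), now
            return self.cache
        # Push: return the last delivered value.
        return self.cache
```

The trade-off is freshness against load on the source: query-time pull is always current, while push and timed pull answer from a local copy.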
Currently, SPoT is a single data source service (video-based plate recognition in car parks).
A DataBearings-based solution is under development to extend SPoT:
• Integrate the currently used data with street parking data from various sources.
Semantic Data Abstraction (via Query Reformulation)
Without (data as it is):
With (interpreted data):
Another DataBearings Pilot: Smart Home
Controlling Actuators as Data Updates
“Citizen Decision Making”
“Citizen Actuation”
Or, just close the door.
Smart Home Pilot Functionality
Enquiries:
• Weather outside, from FMI service
• Light condition outside, from FMI service
• Temperature inside, from ThereGate
• Power consumption at the Audio/Video equipment power outlet, from ThereGate
• State of Audio/Video (Sleep, Standby, Music, TV, PS3), inferred from the above power consumption
Commands:
• Light on/off (2 lamps), via ThereGate
• Play music, via Spotify on a PC
Other:
• Inform (manually) that everyone left the house
• Inform (automatically, based on visibility of home WiFi) that a particular person came or left
• Receive personal "Welcome home, X!" and "Bye, X!" messages
Automation:
• IF Nobody home AND Somebody came home THEN
  • WHEN It is dark enough outside THEN Switch a light on
• IF Everyone left the house THEN Ask whether to switch all the lights off
• IF It is night AND State of Audio/Video changed to ‘Standby’ THEN Switch the lights off (in our case, always means going to sleep)
• IF Wardrobe door is left open AND Somebody is home THEN Alarm via repeatedly switching a light on/off AND Send a message
  • WHEN Door is closed OR Acknowledged from phone OR 1 minute elapsed THEN Stop alarm
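Two of the automation rules above can be written as plain Python predicates to make their conditions explicit. This is only a sketch: the pilot expresses such rules in S-APL, and the function and parameter names below are assumptions.

```python
def lights_off_on_standby(is_night, previous_state, current_state):
    """IF it is night AND Audio/Video state changed to 'Standby' THEN switch lights off."""
    # "Changed to" means the state was something else before and is 'Standby' now.
    changed_to_standby = previous_state != "Standby" and current_state == "Standby"
    return is_night and changed_to_standby

def alarm_should_stop(door_closed, acknowledged, seconds_elapsed):
    """WHEN door is closed OR acknowledged from phone OR 1 minute elapsed THEN stop alarm."""
    return door_closed or acknowledged or seconds_elapsed >= 60
```

Note that the first rule reacts to a state change, not to the state itself, so it fires once on the transition instead of continuously while the equipment stays in standby.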
Foundation: Semantic Agent Programming Language (S-APL)
S-APL
• Much more expressive rules.
• Can remove data, which allows dynamics.
• Allows “procedural”-like programming (equivalents to variables, if-then-else, cycles, functions).
• Can execute Java components.
N3Logic – Tim Berners-Lee et al.'s use of N3 to represent production rules, allowing data and rules to be within the same document or model.
{:Hamppi :occupancy ?x. ?x > 960} => {:Hamppi :is :full}
Notation3 (N3) – Original and current view of Tim Berners-Lee on what RDF should have been. Basically, RDF with nesting.
{:Hamppi :occupancy 200} :source :Designa
Resource Description Framework (RDF) – W3C standard
:Hamppi :occupancy 200
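The three example statements on this slide (a plain triple, a nested statement, and a production rule) can be mirrored with Python tuples and sets. This is an analogy for readers unfamiliar with N3, not an RDF or S-APL implementation; the function name and the default threshold parameter are assumptions, with 960 taken from the slide.

```python
# RDF: a plain subject-predicate-object triple.
facts = {(":Hamppi", ":occupancy", 200)}

# N3 nesting: a statement whose subject is itself a statement.
nested = ((":Hamppi", ":occupancy", 200), ":source", ":Designa")

def apply_full_rule(facts, threshold=960):
    """Analogue of {?p :occupancy ?x. ?x > threshold} => {?p :is :full}."""
    derived = {(s, ":is", ":full")
               for (s, p, o) in facts
               if p == ":occupancy" and o > threshold}
    # Production rule: derived facts are added to the existing ones.
    return facts | derived
```

With an occupancy of 200 the rule derives nothing; only a value above the threshold adds the `:is :full` fact.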
S-APL Example
{
  {
    { ?parking :occupancy ?o; :capacity ?c } :source :Designa .
    ?o = ?c
  } => {
    ?parking :is :full .
    {s:I s:do j:sapl.share.PrintBehavior} s:configuredAs {p:text s:is "?parking is now full"} .
    {
      { ?parking :occupancy ?new } :source :Designa .
      ?new < ?o
    } => {
      s:I s:remove {?parking :is :full}
    }
  }
} s:is s:Rule
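One way to read the rule above is sketched below in Python. The reading is an assumption rather than a statement of S-APL semantics: when occupancy equals capacity, the car park is marked full and a message is printed, and a nested rule later retracts the "full" fact once a lower occupancy is reported. The class and method names are hypothetical, and messages are collected in a list instead of being printed.

```python
class ParkingState:
    """Tracks which car parks are believed to be full (illustrative only)."""

    def __init__(self):
        self.full_at = {}   # parking -> occupancy at which it was marked full
        self.log = []       # stands in for the PrintBehavior output

    def update(self, parking, occupancy, capacity):
        if parking in self.full_at:
            # Inner rule: ?new < ?o  =>  remove {?parking :is :full}
            if occupancy < self.full_at[parking]:
                del self.full_at[parking]
        elif occupancy == capacity:
            # Outer rule: ?o = ?c  =>  {?parking :is :full} plus a message
            self.full_at[parking] = occupancy
            self.log.append(f"{parking} is now full")

    def is_full(self, parking):
        return parking in self.full_at
```

The retraction step is what plain production rules lack: the ability to remove data is what lets the belief follow the sensor readings in both directions.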
DataBearings Architecture
“Make Your Data Flow Smooth”
[Diagram components: DataScripts, Annotations, Reusable Atomic Behaviors (RAB), S-APL engine]
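EII Functionality: Universal Adapter RAB
[Diagram: business case logic sends a semantic query to the universal adapter and receives semantic data. The universal adapter uses data source annotations and plugins (SQL plugin, SOAP plugin, XML plugin, JSON plugin, …), which reach the sources via SQL, SOAP, HTTP GET and HTTP GET respectively.]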
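The plugin arrangement in this diagram can be sketched in Python as an adapter that picks a plugin per source and renames source-specific fields to a shared vocabulary. Everything here is hypothetical: the class names, the dictionary-based annotation format and the field mappings are assumptions, and the JSON plugin parses an inline payload where a real one would issue an HTTP GET.

```python
import json
import sqlite3

class SQLPlugin:
    """Runs a source-specific SQL query and returns rows as dicts."""
    def fetch(self, annotation):
        cursor = annotation["connection"].execute(annotation["query"])
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor]

class JSONPlugin:
    """Parses a JSON payload (stand-in for an HTTP GET to a Web service)."""
    def fetch(self, annotation):
        return json.loads(annotation["payload"])

class UniversalAdapter:
    """Dispatches to a plugin per source, then maps fields to shared names."""
    def __init__(self):
        self.plugins = {"sql": SQLPlugin(), "json": JSONPlugin()}

    def query(self, annotations):
        records = []
        for ann in annotations:
            rows = self.plugins[ann["type"]].fetch(ann)
            for row in rows:
                # Keep only annotated fields, renamed to the shared vocabulary.
                records.append({ann["mapping"][k]: v
                                for k, v in row.items() if k in ann["mapping"]})
        return records
```

Supporting a new kind of source then means adding one plugin and one annotation, while the business case logic stays unchanged.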
Data Source Annotation Example
pdb:Dummy a o:Ontonut;
    o:type "sapl.shared.eii.JSONOntonut";
    o:service "http://localhost:8080/FinnparkDummyProvider/Dummy?plate=%%plate%%";
    o:semantics {
        {
            * d:tree {
                [d:attribute "vehicles"] d:branch {
                    [d:attribute "plate"] d:value ?plate .
                    [d:attribute "country"] d:value ?country .
                    [d:attribute "zone"] d:value ?areaID .
                    [d:attribute "start"] d:value ?start .
                    [d:attribute "end"] d:value ?stop .
                } .
            }
        } => {
            [a fp:ParkingEvent]
                fp:vehicle [fp:hasPlate ?plate; fp:hasModifier ?country];
                fp:parking [fp:hasID ?areaID];
                fp:start ?start;
                fp:end ?stop;
        }
    };
    o:getPattern { * a fp:ParkingEvent }
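What this annotation appears to describe can be approximated in Python: fill the `%%plate%%` placeholder in the service URL, then map each entry under "vehicles" in the JSON response to a parking event in the `fp:` vocabulary. This is a sketch under assumptions: the response is assumed to hold a list of objects under "vehicles", the helper names are hypothetical, and no URL escaping or HTTP request is performed.

```python
import json

SERVICE_TEMPLATE = "http://localhost:8080/FinnparkDummyProvider/Dummy?plate=%%plate%%"

# JSON attribute -> variable name, following the annotation's d:attribute / d:value pairs.
FIELDS = {"plate": "plate", "country": "country", "zone": "areaID",
          "start": "start", "end": "stop"}

def service_url(template, **params):
    # Substitute %%name%% placeholders with query parameter values.
    for name, value in params.items():
        template = template.replace(f"%%{name}%%", str(value))
    return template

def to_parking_events(document):
    """Map each vehicle entry to a dict shaped like the fp:ParkingEvent pattern."""
    events = []
    for vehicle in json.loads(document)["vehicles"]:
        bound = {var: vehicle[attr] for attr, var in FIELDS.items()}
        events.append({
            "a": "fp:ParkingEvent",
            "fp:vehicle": {"fp:hasPlate": bound["plate"],
                           "fp:hasModifier": bound["country"]},
            "fp:parking": {"fp:hasID": bound["areaID"]},
            "fp:start": bound["start"],
            "fp:end": bound["stop"],
        })
    return events
```

The mapping is declarative in the original annotation; the Python version hard-codes it in `FIELDS` purely to show which source attribute ends up in which semantic property.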
Summary: Benefits of DataBearings
As compared to non-semantic EII solutions:
• Lightweight, cheaper
• More powerful: better suited for handling heterogeneity and a multitude of data sources.
• Future-proof: it is much easier to extend the system later to support an (N+1)th data source or an (M+1)th data processing case.
• Integrated: combines data federation and data pipeline capabilities and supports data updates: data can be accessed from multiple sources, processed as needed and delivered to the intended destination, all within a single platform.
As compared to the ETL (extract-transform-load) approach to introducing semantic data management:
• Easier transition: can keep data where it was, no need to transfer data into semantic databases.
• Higher performance: semantic databases typically do not handle big amounts of data well.
• Natural integration of 3rd party data sources: there is typically no control over those, cannot ask to move to semantic representation.