From research to business: the Web of linked data
-
Upload
irene-celino -
Category
Technology
-
view
2.011 -
download
1
description
Transcript of From research to business: the Web of linked data
From research to business: the Web of linked dataEnterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009
From research to business: the Web of linked data
Irene Celino – Semantic Web PracticeCEFRIEL – ICT Institute, Politecnico di Milanoemail: [email protected] – web: http://swa.cefriel.it
Poznan, 29th April 2009 – © CEFRIEL 20092Irene Celino – From research to business: the Web of linked data
AgendaAgendaThe problem of integration
Web as a platformLinked data
How do we produce linked data today?The case of Service-Finder
How do we manage linked data today?The case of Urban Computing in LarKC
What’s next?What’s already going on Business viewScientific & technical view
Poznan, 29th April 2009 – © CEFRIEL 20093Irene Celino – From research to business: the Web of linked data
The problem of integrationThe problem of integrationWhen do we have an integration problem?
Very large amounts of data that grow and evolve continuously
problem of scalescaleNumerous and different data typologies (documents, media, email, Web results, contacts, etc.)
problem of data data heterogeneityheterogeneity
Numerous and differentinformation systems (DB, legacy systems, ERP, etc.)
problem of system system heterogeneityheterogeneity
Poznan, 29th April 2009 – © CEFRIEL 20094Irene Celino – From research to business: the Web of linked data
When 1 + 1 > 2 ?When 1 + 1 > 2 ?Data integration always gives an added valueadded value
Getting a global high-level view
Sharing knowledge
Business opportunities
Business Intelligence
Still there is the technologicaltechnological problemproblem: How to reconcile data heterogeneity?
Who took advantage from integration?
Can (Semantic) Web be of help?
Poznan, 29th April 2009 – © CEFRIEL 20095Irene Celino – From research to business: the Web of linked data
Lesson learned from Web 2.0Lesson learned from Web 2.0Participation politicsParticipation politics and “wisdom of the crowds”
Great success of mashmash--upsupsMash-ups: applications made up of light integration of artifacts provided by third parties (often API or REST services)New integration paradigm to application development
Publication and access via Webvia WebStoring our information on the Web is becoming easier and easierAccessing our information on the Web (e.g. by retrieving it with search engines) is becoming more and more frequent
Poznan, 29th April 2009 – © CEFRIEL 20096Irene Celino – From research to business: the Web of linked data
The Web as integration platformThe Web as integration platformWhat if we integrate on the Webintegrate on the Web?
Web as a platformData prosumer (producer + consumer)
“Web of DataWeb of Data”From current “Web of Documents” to a Web of dataNot only information retrieval, but also data retrieval
Exposing your dataExposing your data on the WebConverting/translating to a suitable format“Wrapping” the data source
D2RD2R VirtuosoVirtuoso
SquirrelRDFSquirrelRDF
SPASQLSPASQLRelational.OWLRelational.OWL
DartGridDartGridSPOONSPOON
TriplifyTriplify
R2OR2OTalisTalis
Poznan, 29th April 2009 – © CEFRIEL 20097Irene Celino – From research to business: the Web of linked data
Linked data and data cloudLinked data and data cloudLinked DataLinked Data
The realization of the “Web of Data” (and of the Semantic Web)Tim Berners-Lee: http://www.w3.org/DesignIssues/LinkedData
Linking Open Data InitiativeInitiativeA community publishing and linking data on the Webhttp://linkeddata.org/
Data cloudData cloudToday everybody talks about cloud computing
However, often it’s not only a computation or storage issue, but it also about data and knowledge management
Poznan, 29th April 2009 – © CEFRIEL 20098Irene Celino – From research to business: the Web of linked data
Challenges for linked dataChallenges for linked dataAutomatic linked data creation and linkageAutomatic linked data creation and linkage
Automatic generation of linked data and smart mechanisms to identify “contact points” between different data sources and to seamlessly link them
Distributed queryingDistributed queryingQuerying distributed data over different Web sources regardless the “physical position” of data and getting aggregated results
Distributed reasoningDistributed reasoningApplying inference techniques to distributed data, preserving consistency and correctness of the reasoning
From research to business: the Web of linked dataEnterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009
Service-Finderhttp://demo.service-finder.eu
There’s a lot of information already on the Web: how can we turn it into linked data?
Poznan, 29th April 2009 – © CEFRIEL 200910Irene Celino – From research to business: the Web of linked data
Context: SOA onto the WebContext: SOA onto the WebService Oriented Architectures (SOAs) along with Web Services technologies are widely seen as the most promising fundament for realizing service interchange in business to business settings.However, it is envisioned that SOAs andWeb Services will increasingly move outof these settings and out onto the Web.Web size
Google: 1.000.000.000.000 URIs (08/2008)NetCraft: 62.000.000 active hosts
Service Web sizeGoogle: filetype:asmx inurl:wsdl (818)Service-Finder: > 25.000
[ http://aws.amazon.com/ ]
[ http://developer.ebay.com/ ]
Poznan, 29th April 2009 – © CEFRIEL 200911Irene Celino – From research to business: the Web of linked data
One of the essential building blocks for creating applications that utilize the vast quantities of services, which are available on the Web is making it easier to discoveryand select the right servicesUDDI was initially proposed as a component of Web Services usage process enabling registering and discovering services, but finally UDDI did not reach its expected potentialThe critical problem in this new Web oriented environment is one of scalebecause services appear, disappear and change at a rate much higher than in business to business settings
UDDI Business Registry Shutdown. "With the approval of UDDI v3.02 as an OASIS Standard in 2005, and the momentum UDDI has achieved in market adoption, IBM, Microsoft and SAP have evaluated the status of the UDDI Business Registry and determined that the goals for the project have been achieved. Given this, the UDDI Business Registry will be discontinued as of 12 January 2006." [from “Registering for UDDI” 2005-12-17 ][see http://xml.coverpages.org/uddi.html ]
The rise and fall of public UDDI registriesThe rise and fall of public UDDI registries
Poznan, 29th April 2009 – © CEFRIEL 200912Irene Celino – From research to business: the Web of linked data
Pitfalls of public UDDI registriesPitfalls of public UDDI registries1. UDDI is centered around programmatic access to the registry and
only a few mostly technically focused user interfaces are available.
2. The information in public UDDI registry was often outdated. Thevalue of the service in the public UDDI registry is minimal if the service itself does not exist anymore.
3. There are no means for community feedback. Practically there is only one possibility to provide feedback allowing the user to contact a provider by email listed in the service description.
4. A WSDL definition and a short description is not sufficient for a service consumer to select a service. To make decision about applicability of the service, service consumer need to become familiar with pricing, terms and condition, service level agreements to name just a few.
Poznan, 29th April 2009 – © CEFRIEL 200913Irene Celino – From research to business: the Web of linked data
Overcoming UDDI limitationOvercoming UDDI limitation1. Easy to use GUI – It is important that early adopters of Web
Services technology, who learns about it for the first time, should be able to start exploring it with a few simply steps
2. Search Engine style – Web is unpredictable and services can appear and disappear (the same as websites), but one can put up a mechanism (periodic crawling and availability check) allowing to eliminate these services which are not available anymore
3. Architecture of participation – Learn from Web 2.0 (e.g., wikis, blogs, etc.) in enabling community contribution
4. More useful info – Include all information required by a user to make decision about applicability of the service; e.g., pricing,terms and condition, service level agreements, etc.
Poznan, 29th April 2009 – © CEFRIEL 200914Irene Celino – From research to business: the Web of linked data
project ideaproject idea
SemanticsKnowledge Representation
& Reasoning
Web ServicesAs a basic tool to implementa Service Oriented Architecture
Semantic Web ServicesAs a means to realize
Service Oriented Architecture
Web 2.0User clusteringUser-Resource correlation
Semantic SearchConceptual Indexing
Semantic Matching
AutomaticSemantic Annotation
Combining smart-machine and smart-data
Service-Finder aims at developing a platform for service discovery in which Web Services are embedded in a Web 2.0 environmentService-Finder aims at developing a platform for service discovery in which Web Services are embedded in a Web 2.0 environment
Realizing Web Service Discovery at Web Scale
http://demo.service-finder.eu
Poznan, 29th April 2009 – © CEFRIEL 200915Irene Celino – From research to business: the Web of linked data
key objectiveskey objectives
Create a Semantic Search Engine for Web ServicesAggregates information from heterogeneous sources: WSDL, wikis, blogs and also users’ feedbacks and behaviourCreate a Web Service Crawler to identify Web Services and their relevant information
Automatically generate Semantic Service Descriptionsby analyzing heterogeneous sourcesAllow efficient and effective search of collected and generated dataProvide a Web 2.0 portal
To support users in searching and browsing for Web ServicesTo give recommendations to usersTo track user behaviour for improving accuracy of service searchand user recommendations
Create a Semantic Search Engine for Web ServicesAggregates information from heterogeneous sources: WSDL, wikis, blogs and also users’ feedbacks and behaviourCreate a Web Service Crawler to identify Web Services and their relevant information
Automatically generate Semantic Service Descriptionsby analyzing heterogeneous sourcesAllow efficient and effective search of collected and generated dataProvide a Web 2.0 portal
To support users in searching and browsing for Web ServicesTo give recommendations to usersTo track user behaviour for improving accuracy of service searchand user recommendations
Poznan, 29th April 2009 – © CEFRIEL 200916Irene Celino – From research to business: the Web of linked data
RealizingRealizing____________
June 2008
Jan 2008
Dec 2008
Dec 2009
Today
Poznan, 29th April 2009 – © CEFRIEL 200917Irene Celino – From research to business: the Web of linked data
Use cases forUse cases for____________To gather requirements we imaged several use cases
A system administrator at a bank who is looking for an SMS Messaging service that sends him an SMS in any case failures with the on-line payment system of the bankA business and technology consultant working on a e-health project that needs to make it possible for general practitioners to send and receive fax directly from their patient record application using an on-line serviceA web developer that, after using a service listed on Service-Finder, decides to edit the information on the portal in order to improve it for other community users
Poznan, 29th April 2009 – © CEFRIEL 200918Irene Celino – From research to business: the Web of linked data
RequirementsRequirements forfor ___________We identified within those previous use cases more than 60 requirements and we grouped similar requirements together into three main categories:
Search related: search for text, search for tag, search for concept, disambiguation, facet-browsing, ranking, sorting, comparing, etc.Web Service information related:
Services details: interface, how can the service be used, its payment modalities, its terms and clauses, user-added information as ratings, comments and tags, measured values of service levels such as availability (uptime) or performance (response time) and the service level declared by the provider. Providers info: name of the provider and its references, user-added information as ratings, comments and tags
User Community related: rating, commenting, tagging, editing, writing wiki entries, registration, recommendations
Poznan, 29th April 2009 – © CEFRIEL 200919Irene Celino – From research to business: the Web of linked data
Architecture and ComponentsArchitecture and Components
Poznan, 29th April 2009 – © CEFRIEL 200920Irene Celino – From research to business: the Web of linked data
Key innovations ofKey innovations of ___________Research Activities
Automatic Service Annotation
To automatic create Web Service descriptions by analyzing WSDL and related information• coping with contradictions• using community process to verify results
User and Service Clustering
To investigate and implement techniques for:• clustering users accordingly to their behaviours• clustering services accordingly to their usage by users belonging to the same clusters
Research and Engineering Activities
Conceptual Indexing and Matching
To apply semantic technologies in the Web Service discovery domainTo adopt them to the new forms of input descriptions:• Automatic annotations, clusters, contexts
Integration Activities
Service-Finder Portal
To provide a Web 2.0 portal• demonstrating the developed technologies• fostering communities participation
Poznan, 29th April 2009 – © CEFRIEL 200921Irene Celino – From research to business: the Web of linked data
Beyond state of the artBeyond state of the artFeature State of the art Improvement
Architecture for lightweight semantic service discovery
Approaches based on a registration process or an editorial team
Enables to scale service discovery with the upcoming increase of publicly available services
Largest and most accurate set of publicly available services
Specialized portals only containing subset of services
Focused crawler able to identify services and related information
Automatic metadata creation for Web Service
Innovative; under-researched Metadata generation from Web 2.0 data and services
Integration of formal and informal (textual) knowledge
Indexed textual descriptions Hybrid match-making algorithm
Automatic creation of both user and service clusters
Only general-purpose clustering techniques exist
Specialize clustering algorithms that jointly cluster users and services
Innovative interface that combines Web 2.0 features and service related features
Current Web 2.0 portals do not include semantic metadata.
Techniques that enable handling of semantic metadata in Web 2.0 portals
Poznan, 29th April 2009 – © CEFRIEL 200922Irene Celino – From research to business: the Web of linked data
Expected ImpactsExpected ImpactsService-Finder provides core mechanisms to cope with changing environments:
It uses Web principles such as openness and robustness;It takes explicit and implicit user interaction for construction, improvement and validation of rich service description; andIt exploits Semantic Web technologies as means to organize internally the data on available services.
It simplifies the service publishing process by removing the burden of any registration and brings service discovery even to non-technical persons.
Publishers increase their productivity, by being able to provide complex services without the need to register them explicitly.Creators become able to design more communicative forms of content by integrating third party services.Organizations can automate their processes by quickly finding adequate services.
Poznan, 29th April 2009 – © CEFRIEL 200923Irene Celino – From research to business: the Web of linked data
Exploitation ProspectsExploitation ProspectsThe results of the Service-Finder project have the potential to revolutionize this market and to outperform existing solutionsUsing Service Finder for Public services
Unique chancemarket for public services increases (xignite, cdyne, …)
Missing AlternativesUDDI (has been shutdown in 2006)Google (no reliable filter / no additional information)Portals (rely on editorial process <=400 services)
Service finder can also be applied within organizationsNumber of Services increases in organizationsAs within internet repositories in big companies can be quickly outdatedIT Manager like minimal invasive technology
Poznan, 29th April 2009 – © CEFRIEL 200924Irene Celino – From research to business: the Web of linked data
So what? ServiceSo what? Service--Finder and linked dataFinder and linked dataEven if I didn’t explicitly talk about linked data, that is exactly the result of Service-Finder
We take information about services from the Web, we translate it into structured information describing services wrt to domain-specific ontologies, we gives this information back to the community that can further enrich it
Is this linked data? Not yet, but:RDFa annotation in SF portal pages coming soonServices to query the knowledge base coming soonPossibly a “dump” of SF knowledge base could be easily published on the Web as linked data
From research to business: the Web of linked dataEnterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009
Urban Computing in LarKChttp://wiki.larkc.eu/UrbanComputing
There are lots of data sources about cities on the Web: how can we query and reason on it?
Poznan, 29th April 2009 – © CEFRIEL 200926Irene Celino – From research to business: the Web of linked data
Context: Cities are aliveContext: Cities are alive
Cities come to life, grow, evolve like living beingsThe state of a city changes continuously, influenced by a lot of factors
human factors: people moving in the city or extending itnatural factors: precipitations or climate changes
26NTT DoCoMo Invited speech, 11-3-2009
[source http://www.citysense.com]
Poznan, 29th April 2009 – © CEFRIEL 200927Irene Celino – From research to business: the Web of linked data
Today CitiesToday Cities’’ ChallengesChallengesOur cities face many challenges
• How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?• How can we include citizens in planning their communities
rather than limiting input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?
• How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?• How can we include citizens in planning their communities
rather than limiting input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?
[ source http://www.uli.org/]
Poznan, 29th April 2009 – © CEFRIEL 200928Irene Celino – From research to business: the Web of linked data
Urban Computing to address challengesUrban Computing to address challenges
Poznan, 29th April 2009 – © CEFRIEL 200929Irene Celino – From research to business: the Web of linked data
Urban ComputingUrban ComputingA definition:
The integration of computing, sensing, and actuation technologies into everyday urban settings and lifestyles.
Urban settings include, for example, streets, squares, pubs, shops, buses, and cafés - any space in the semipublic realms of our towns and citiesOnly in the last few years have researchers paid much attention to technologies in these spacesPervasive computing has largely been applied
either in relatively homogeneous rural areas, where researchers have added sensors in places such as forests, vineyards, and glaciersor, on the other hand, in small-scale, well-defined patches of the built environment such as smart houses or rooms
Urban settings are challenging for experimentation and deployment, and they remain little explored
[source IEEE Pervasive Computing,July-September 2007 (Vol. 6, No. 3)]
Poznan, 29th April 2009 – © CEFRIEL 200930Irene Celino – From research to business: the Web of linked data
Availability of DataAvailability of DataSome years ago, due to the lack of data, solving Urban Computing problems with ICT looked like a Sci-Fi ideaNowadays, a large amount of the required information can be made available on the Web at almost no cost. We are running a survey and we have collected more than 50 sources of data:
maps with streets and paths (Google Maps, Yahoo! Maps…),events scheduled (EVDB, Upcoming…),multimedia data with information about location (Flickr…)relevant places (schools, bus stops, airports...)traffic information (accidents, problems of public transportation...)city life (job ads, pollution, health care...)
We are running a survey (please contribute), seehttp://wiki.larkc.eu/UrbanComputing/ShowUsABetterWayhttp://wiki.larkc.eu/UrbanComputing/OtherDataSources
Poznan, 29th April 2009 – © CEFRIEL 200931Irene Celino – From research to business: the Web of linked data
Are Data Are Data MashupsMashups the solution?the solution?
[source: http://www-01.ibm.com/software/lotus/products/mashups/ ]
IBM Lotus Mashups
[source: http://editor.googlemashups.com ]
[source: http://pipes.yahoo.com/pipes/ ]
[source: http://www.popfly.com/ ]
[source: http://openkapow.com/ ]
Poznan, 29th April 2009 – © CEFRIEL 200932Irene Celino – From research to business: the Web of linked data
Data Data MashupsMashups offer powerful visualizationsoffer powerful visualizations
Google Charts API
http://code.google.com/apis/chart/http://maps.google.it/
http://maps.yahoo.com/
MIT Simile Timeline & Timeplot
http://simile.mit.edu/timeline/ http://simile.mit.edu/timeplot/
Poznan, 29th April 2009 – © CEFRIEL 200933Irene Celino – From research to business: the Web of linked data
Data Data MashupsMashups offer simple programming offer simple programming abstractionsabstractions
Poznan, 29th April 2009 – © CEFRIEL 200934Irene Celino – From research to business: the Web of linked data
Not everything boils down to plumbingNot everything boils down to plumbing
Poznan, 29th April 2009 – © CEFRIEL 200935Irene Celino – From research to business: the Web of linked data
The LarKC projectThe LarKC project
[Source: Fensel, D., van Harmelen, F.: Unifying reasoning and search to web scale. IEEE Internet Computing 11(2) (2007)]
Visit http://www.larkc.eu !Visit http://www.larkc.eu !
Poznan, 29th April 2009 – © CEFRIEL 200936Irene Celino – From research to business: the Web of linked data
Sustainable mobility as an exampleSustainable mobility as an exampleUrban Computing proposes a set of different issues, from technological to social ones. Our experience in the field make us believe that sustainable mobility is an exemplar case which we can elicit generalizablerequirements from.Mobility demand has been growing steadily for decades and it will continue in the future.For many years, the primary way of dealing with this increasing demand has been the increase of the roadway network capacity, by building new roads or adding new lanes to existing ones. However, financial and ecological considerations are posing increasingly severe constraints on this process. Hence, there is a need for additional intelligent approaches designed to meet the demand while more efficiently utilizing the existing infrastructure and resources.
• How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?
• How can we include citizens in planning their communities rather than limiting input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?
• How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?
• How can we include citizens in planning their communities rather than limiting input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?
Poznan, 29th April 2009 – © CEFRIEL 200937Irene Celino – From research to business: the Web of linked data
Actors:Carlo: a citizen living in Varese. The day after, he has to go to Lombardy Region premises in Milano at 11.00. UCS: a fictitious Urban Computing System of Milano area
Ways to MilanoPrivate CarFS railwaysLe Nord railways
A Challenging Use Case 1/2 (planning)A Challenging Use Case 1/2 (planning)
Varese
Milano
©2009 Google – Map Data @2009 Teleatlas – Terms of Usage
©2009 Google – Map Data @2009 Teleatlas – Terms of Usage
Poznan, 29th April 2009 – © CEFRIEL 200938Irene Celino – From research to business: the Web of linked data
Actors:Carlo: a citizen living in Varese. The day after, he has to go to Lombardy Region premises in Milano at 11.00. UCS: a fictitious Urban Computing System of Milano area
Ways to MilanoPrivate CarFS railwaysLe Nord railways
A Challenging Use Case 2/2 (traveling)A Challenging Use Case 2/2 (traveling)
Varese
Milano
M
M
©2009 Google – Map Data @2009 Teleatlas – Terms of Usage
©2009 Google – Map Data @2009 Teleatlas – Terms of Usage
Poznan, 29th April 2009 – © CEFRIEL 200939Irene Celino – From research to business: the Web of linked data
Requirements for LarKCRequirements for LarKCUrban Computing (and Mobility Management) encompass sensing, actuation and computing requirements.Many previous work in the area of Pervasive and Ubiquitous Computing investigated requirements in sensing, actuation, and several aspects of computation (from hardware to software, from networks to devices)In this work we are focusing on reasoning requirementsfor LarKC, but also of general interest for the entire community working on the complex relationship of the Internet with space, places, people and content.Hereafter we exemplify how coping with
representational, reasoning, and defaults heterogeneityscaletime-dependencynoisy, uncertain and inconsistent data
Poznan, 29th April 2009 – © CEFRIEL 200940Irene Celino – From research to business: the Web of linked data
Coping with representational heterogeneityCoping with representational heterogeneityIt is an obvious requirement
data always come in different formats (syntactic and structural heterogeneity)legacy data not in semantic formats will always exist!the problem of merging and aligning ontologies is a structural problem of knowledge engineering and it must be always considered when developing an application of semantic technologies.
Poznan, 29th April 2009 – © CEFRIEL 200941Irene Celino – From research to business: the Web of linked data
Coping with reasoning heterogeneityCoping with reasoning heterogeneityIt means the systems allow for multiple paradigms of reasoners; e.g.
precise and consistent inference for telling that at a given junction all vehicles, but public transportation ones, must go straight
approximate reasoning when calculating the probability of a traffic jam given the current traffic conditions and the past history
[ source http://senseable.mit.edu/ ]
Poznan, 29th April 2009 – © CEFRIEL 200942Irene Celino – From research to business: the Web of linked data
Coping with defaults heterogeneity 1/2Coping with defaults heterogeneity 1/2Open World Assumption vs. Close World Assumption
While for the an entire city we cannot assume complete knowledge, for a time table of a bus station we can
[source: http://gizmodo.com/photogallery/trafficsky/1003143552 ]
Poznan, 29th April 2009 – © CEFRIEL 200943Irene Celino – From research to business: the Web of linked data
Coping with defaults heterogeneity 2/2Coping with defaults heterogeneity 2/2Unique Name Assumption
A square with several station for buses and subway can be considered a unique point for multimodal travel planning, but not when the problem is giving direction in that square to a pedestrian
©2009 Google – Map Data @2009 Teleatlas – Terms of Usage ©2009 Google – Imagery @2009 Teleatlas – Terms of Usage
Poznan, 29th April 2009 – © CEFRIEL 200944Irene Celino – From research to business: the Web of linked data
Coping with scaleCoping with scaleThe advent of Pervasive Computing and Web 2.0 technologies led to a constantly growing amount of data about urban environmentsAlthough we encounter large scale data which are not manageable, it does not necessary mean that we have to deal with all of the data simultaneously.Usually, only very limited amount data are relevant for a single query/processing at a specific application. For example, when Carlo is driving to Milano,
only part of the Milano map data are relevant.the local parking information may become active by a prediction of the known relation between bad weather conditions and destination parking lot re-planning.
Poznan, 29th April 2009 – © CEFRIEL 200945Irene Celino – From research to business: the Web of linked data
Coping with timeCoping with time--dependencydependencyKnowledge and data can change over the time.
For instance, in Urban Computing names of streets, landmarks, kind of events, etc. change very slowly, whereas the number of cars that go through a traffic detector in five minutes changes very fast.
This means that the system must have the notion of ''observation period'', defined as the period when we the system is subject to querying.Moreover the system, within a given observation period, must consider the following four different types of knowledge and data:
Invariable knowledgeInvariable dataPeriodically changing data that change according to a temporal law that can beEvent driven changing data that are updated as a consequence of some external event.
Poznan, 29th April 2009 – © CEFRIEL 200946Irene Celino – From research to business: the Web of linked data
Invariable knowledge and dataInvariable knowledge and dataInvariable knowledge
it includes obvious terminological knowledge
such as an address is made up by a street name, a civic number, a city name and a ZIP code
less obvious nomological knowledge that describes how the world is expected
to be e.g., given traffic lights are switched off or certain streets are closed during the night
to evolve e.g., traffic jams appears more often when it rains or when important sport events take place
Invariable datado not change in the observation period, e.g. the names and lengths of the roads.
©2009 Google – Imagery @2009 Teleatlas – Terms of Usage
Poznan, 29th April 2009 – © CEFRIEL 200947Irene Celino – From research to business: the Web of linked data
Changing dataChanging dataPeriodically changing data change according to a temporal law that can be
Pure periodic law, e.g. every night at 10pm Milano overpasses close.Probabilistic law, e.g. traffic jam appear in the west side of Milano due to bad weather or when San Siro stadium hosts a soccer match.
Event driven changing data are updated as a consequence of some external event. They can be further characterized by the mean time between changes:
Slow, e.g. roads closed for scheduled worksMedium, e.g. roads closed for accidents or congestion due to trafficFast, e.g. the intensity of traffic for each street in a city
©2009 Google – Imagery @2009 Teleatlas – Terms of Usage
Poznan, 29th April 2009 – © CEFRIEL 200948Irene Celino – From research to business: the Web of linked data
Coping with noisy, uncertain and inconsistent dataCoping with noisy, uncertain and inconsistent dataTraffic data are a very good example of such data. Different sensors observing the same road area give apparently inconsistent information.
a traffic camera may say that the road is empty whereas an inductive loop traffic detector may tell 100 vehicles went over itThe two information may be coherent if one consider that a traffic camera transmits an image per second with a delay of 15-30 seconds, whereas a traffic detector tells the number of vehicles that went over it in 5 minutes and the information may arrive 5-10 minutes later.
Moreover, a single data coming from a sensor in a given moment may have no certain meaning.
an inductive loop traffic detector, it tells you 0 car went overIs the road empty? Is the traffic completely stuck? Did somebody park the car above the sensor? Is the sensor broken?
Combining multiple information from multiple sensors in a given time window can be the only reasonable way to reduce the uncertainty.
Poznan, 29th April 2009 – © CEFRIEL 200949Irene Celino – From research to business: the Web of linked data
Towards requirements satisfaction in LarKCTowards requirements satisfaction in LarKC
The Large Knowledge Collidera platform for infinitely scalable reasoning on the data-web
Pip
elin
e
Poznan, 29th April 2009 – © CEFRIEL 200950Irene Celino – From research to business: the Web of linked data
LarKC platform
Interface
Mobile Data Mashup Environment
SPARQLquery
SPARQLresult
RESTrequest
JSONresponse
Request data Data
PipelineConfig.
PROBLEM: Which Milano monuments or events or friends can I quickly get to from here?
TrafficMonumentsEventsPeople
The first Data The first Data MashupMashup withinwithin_________
http://www.larkc.eu
Poznan, 29th April 2009 – © CEFRIEL 200951Irene Celino – From research to business: the Web of linked data
A roadmap towards LarKC Urban use caseA roadmap towards LarKC Urban use caseData
Known: street topology, monuments/events/friends location, traffic situation (current data stream + historical time series)Inferred: traffic predictions, residual street capacity
Formulating the query for LarKCBasic: shortest path from A to BExtended: shortest path from A to monuments/events/friendsAdvanced: considering traffic predictions and residual street capacity
Configuring the pipelineBasic configuration
Combining a SPARQL processor and a Graph ProcessorUsing AllegroGraph GeoExtension as a selector
Extended configurationDBpedia, EVDB, GoogleLatitude selector
Advanced configuration: traffic predictions based on recurrent neural networks, residual street capacity based on data stream analysis
Poznan, 29th April 2009 – © CEFRIEL 200952Irene Celino – From research to business: the Web of linked data
LarKC Early Adopters WorkshopLarKC Early Adopters WorkshopThe public launch of the first open source LarKC platform release will take place at the forthcoming European Semantic Web Conference (ESWC 2009)
Register for the event! More information at: http://earlyadopters.larkc.eu/
We are developing the Urban Baby LarKC as a showcase of the potentiality of such platformEverybody will be invited to run experiments over LarKC
The Large Knowledge Collider a platform for massive distributed incomplete reasoninghttp://www.larkc.eu
From research to business: the Web of linked dataEnterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009
The next Web of open, linked data
Just research? What’s going on? Why should I care?
Poznan, 29th April 2009 – © CEFRIEL 200954Irene Celino – From research to business: the Web of linked data
“an open, shared database of the world’s information”
Source: Freebase - http://www.freebase.com (2009)
FreebaseFreebase
Poznan, 29th April 2009 – © CEFRIEL 200955Irene Celino – From research to business: the Web of linked data
OpenCalaisOpenCalais
Source: Thomson Reuters - http://www.opencalais.com/ (2009)
Poznan, 29th April 2009 – © CEFRIEL 200956Irene Celino – From research to business: the Web of linked data
WhatWhat’’s next? Business point of views next? Business point of viewOrganization today are used to produce lots of data……and they have the problem of managing and making sense of them!
More and more often they ask for Business Intelligence and related technologies to understand and decideBut it also happens that, in order to fully understand what’s going on and to take informed decisions, the data within the organization should be integrated or enhanced with external knowledge
This could definitely be a job for linked data technology!
Poznan, 29th April 2009 – © CEFRIEL 200957Irene Celino – From research to business: the Web of linked data
www.flickr.com/photos/_-amy-_/3167333250/
““Stop hugging Stop hugging your datayour data””
Sir Tim BernersSir Tim Berners--Lee, 2009Lee, 2009
Linked data seen by the Web inventorLinked data seen by the Web inventor
Don’t let considerations
about security or data ownership
represent an obstacle to
innovation and opportunities
Poznan, 29th April 2009 – © CEFRIEL 200958Irene Celino – From research to business: the Web of linked data
WhatWhat’’s next? Technological point of views next? Technological point of viewHow Business Intelligence and similar techniques change when their basic assumptions are no more valid?
Dynamically changing data sources (and data themselves…)Inconsistency typical of the Web (everything & the opposite of everything)Partial informationMore information than expected or than needed
Linked data pose new challenges for existing technologies!
Poznan, 29th April 2009 – © CEFRIEL 200959Irene Celino – From research to business: the Web of linked data
If I didnIf I didn’’t convince yout convince you……http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html
From research to business: the Web of linked dataEnterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009
Thanks for your attention! Any question?
Contacts: Irene Celino – Semantic Web PracticeCEFRIEL – ICT Institute, Politecnico di Milanoemail: [email protected] – web: http://swa.cefriel.it
phone: +39-02-23954266 – fax: +39-02-23954466Slides available at: http://www.slideshare.net/iricelino