Semantic Technologies (Not Only) for Improved Search in SharePoint
DIQA Projektmanagement GmbH
Pfinztalstraße 90
76227 Karlsruhe
Semantic Technologies
(Not Only) for Improved
Search in SharePoint
Daniel Hansch
Shared Solutions Day, 20 February 2014
© 2013 DIQA Projektmanagement GmbH | www.diqa-pm.com | Slide 2 DIQA Portfolio, January 2013
About DIQA GmbH
DIQA is an independent software vendor of knowledge management tools for ECM portals.
Our vision:
We provide our customers with services and products that turn their ECM
portals into smart portals by introducing semantic web technologies. Smart
portals let end-users better find, organize, process, control and govern
unstructured content.
Founded: 2012
Team: SharePoint, MediaWiki, knowledge management and semantic web specialists
Location: Germany, Karlsruhe
Agenda
• The Semantic Web
  • Vision, goals
  • Principles
  • Base technologies
  • Available data
• Applications
  • BBC Semantic Publishing
  • Google Knowledge Graph
  • Facebook Open Graph
  • Wikidata
• Using the Semantic Web in SharePoint
• Semantic Search in SharePoint
The Semantic Web
• Tim Berners-Lee’s vision of a semantic web:
  “The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.”
  http://www.w3.org/DesignIssues/LinkedData.html
• Note: We treat the following terms as synonyms:
  • Semantic Web
  • Web of Data
  • Linked (Open) Data
Linked Data Principles
★ Available on the web (whatever format), with an open license, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ Available in a non-proprietary format (e.g. CSV instead of Excel)
★★★★ Using open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ Linked to other people’s data to provide context
Tim Berners-Lee (2010): http://www.w3.org/DesignIssues/LinkedData.html
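The star scheme above is cumulative: each level presupposes the ones before it. A minimal Python sketch of that reading follows; the `Dataset` class and its field names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustration only)."""
    on_web_open_license: bool
    machine_readable: bool
    non_proprietary_format: bool
    uses_rdf_and_uris: bool
    linked_to_other_data: bool


def star_rating(ds: Dataset) -> int:
    """Count stars; a level only counts if all earlier levels are met."""
    levels = [
        ds.on_web_open_license,
        ds.machine_readable,
        ds.non_proprietary_format,
        ds.uses_rdf_and_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for satisfied in levels:
        if not satisfied:
            break
        stars += 1
    return stars


# A CSV file published under an open license, without RDF or links.
csv_export = Dataset(True, True, True, False, False)
print(star_rating(csv_export))
```

Under this reading, a dataset that skips a level cannot collect the stars beyond it, which matches the "all of the above, plus" structure of the scheme.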
RDF Data Model
• The Web of Data is based on the RDF data model
• RDF is a semi-structured graph data model
• Nodes and edges are labeled with URIs
• Basic pattern (triple): subject-predicate-object
  • BusinessEntity1 offers Offering1
  • UnitPriceSpec1 hasValue “200.0”
• RDF can be serialized in many formats, incl. RDF/XML
http://www.heppnetz.de/projects/goodrelations/primer/images/fig1.png
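To make the triple pattern concrete, here is a small Python sketch that writes the two example statements in N-Triples, one of the W3C serializations of RDF. The namespace `http://example.org/shop#` and the property names under it are placeholders for this illustration, not the actual GoodRelations vocabulary.

```python
# Hypothetical namespace for the illustration; not a real vocabulary.
EX = "http://example.org/shop#"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Each statement is a (subject, predicate, object) triple.
triples = [
    (iri(EX + "BusinessEntity1"), iri(EX + "offers"), iri(EX + "Offering1")),
    (iri(EX + "UnitPriceSpec1"), iri(EX + "hasValue"), literal("200.0")),
]


def to_ntriples(statements) -> str:
    """Serialize triples, one statement per line, terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


print(to_ntriples(triples))
```

Subjects and predicates are always identified by URIs, while the object may be either another node or a literal value such as the price.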
Linked Data Cloud 2007
Source for this and the following graphs: Linking Open Data cloud, Richard Cyganiak and Anja Jentzsch
Linked Data Cloud 2008
Linked Data Cloud 2009
Linked Data Cloud 2010
Linked Data Cloud 2011
BBC
Early adopter of the Web of Data (“Linking Open Data” project), in three roles:
• Data provider (program catalogue, artists)
• Data consumer (links to external resources about artists)
• Technology provider (similar to Thomson Reuters, Elsevier and NYT?)
Dynamic Semantic Publishing (DSP) architecture
• Semantic web technology stack to reduce curation effort for online media production
• Challenge: the BBC Sports sites for the 2010 World Cup and the Olympic Games comprise 700 index pages that require curation (links to story pages etc.) and frequent updates.
• DSP replaces static publishing with dynamic aggregation that makes use of a metadata layer.
• Workflow:
  • Editors author stories
  • Stories are tagged (semi-)automatically
  • Index pages are generated automatically and kept up to date through queries that use the tags.
Benefits
• Reduced effort for curation
• Deeper and broader access to BBC content
• Increased quality
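The workflow above can be sketched in a few lines of Python. This is only an illustration of dynamic aggregation by tag, not the BBC's implementation; the `Story` class, the tag names and the story titles are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """A story with its (semi-)automatically assigned tags."""
    title: str
    tags: set = field(default_factory=set)


# Invented sample content for the illustration.
stories = [
    Story("Match report: quarter-final", {"football", "world-cup-2010"}),
    Story("Squad announced", {"football", "world-cup-2010"}),
    Story("Sprint final preview", {"athletics", "olympics"}),
]


def index_page(all_stories, tag):
    """Build an index page as a query over tags instead of a curated list."""
    return [story.title for story in all_stories if tag in story.tags]


print(index_page(stories, "world-cup-2010"))

# A newly tagged story appears on the index page without manual curation.
stories.append(Story("Final preview", {"football", "world-cup-2010"}))
print(index_page(stories, "world-cup-2010"))
```

Because the index page is computed from the metadata layer on every request, editors only author and tag stories; they never edit the index itself.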
BBC Wildlife Portal
Google Knowledge Graph
• 2005: Google hires Guha (co-inventor of RSS and RDF)
• 2010: Google acquires Metaweb (developers of Freebase)
• 2011: Bing, Google and Yahoo! introduce Schema.org
  • Goal: a common set of schemas for structured data markup on web pages
  • Based on ontologies and formal metadata
  • Improve search results
• 2012: Google starts enhancing search results with formal metadata from the Knowledge Graph, based on
  • Wikipedia crawls (~DBpedia)
  • Freebase
  • CIA World Factbook and more
• 2013: Google hires Denny Vrandečić (co-inventor of Semantic MediaWiki and Wikidata) …
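As an illustration of structured data markup with Schema.org, the following Python sketch emits a JSON-LD description of an organization, using the company details from the title slide. JSON-LD is one of the syntaxes Schema.org markup can be written in; the helper name `as_script_tag` is made up for this example.

```python
import json

# Schema.org description built from the address on the title slide.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "DIQA Projektmanagement GmbH",
    "url": "http://www.diqa-pm.com",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "Pfinztalstraße 90",
        "postalCode": "76227",
        "addressLocality": "Karlsruhe",
    },
}


def as_script_tag(data: dict) -> str:
    """Wrap the JSON-LD document in the script element used on web pages."""
    body = json.dumps(data, ensure_ascii=False, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(as_script_tag(organization))
```

A crawler that understands Schema.org can read the type and address from this block without having to interpret the visible page text.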
Facebook Open Graph
• Started as the Social Graph (friends)
• Now, every web page or thing can become a node in the Facebook Graph
  • Social plugins on pages, e.g. Like
• Nodes can be linked with different kinds of edges
  • Friend, like, write, listen, eat, cook
• The Graph API makes data readable and writable for Facebook apps
Wikidata
Wikidata in Wikipedia
Linked Data Cloud: Life Sciences Data
Other Sources for Data in Life Sciences
• From the LOD cloud
  • UniProt
  • SIDER
  • DrugBank
  • PubMed
  • GeneOntology
  • PubChem
  • ChEMBL
  • KEGG Drug, Pathway, Enzyme, Reaction, …
  • …
• LinkedLifeData combines
  • ChEBI
  • Diseasome
  • DrugBank
  • EntrezGene
  • GeneOntology
  • NCI
  • SIDER
  • PubMed
  • UMLS
  • UniProt
  • …
http://linkedlifedata.com/
Use Linked Data from UniProt to Filter SharePoint Documents
Terms from UniProt are used as “Semantic Tags”. Each tag is associated with an enzyme in UniProt. The list of documents shown is generated from a SPARQL query that returns all documents about an enzyme that has “Magnesium” as a cofactor.
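The filter described here is a join between two kinds of facts: which enzyme a document is tagged with, and which cofactor that enzyme has. The Python sketch below reproduces the join in memory. The query text uses a hypothetical `ex:` vocabulary (not the UniProt ontology), and the enzymes, cofactors and file names are illustrative sample data.

```python
# Shape of the query, written against a hypothetical vocabulary.
QUERY_SKETCH = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?doc WHERE {
  ?doc ex:taggedWith ?enzyme .
  ?enzyme ex:hasCofactor "Magnesium" .
}
"""

# Illustrative facts that would come from the linked data source.
enzyme_cofactors = {
    "Hexokinase": {"Magnesium"},
    "Carbonic anhydrase": {"Zinc"},
}

# Illustrative semantic tags on SharePoint documents.
document_tags = {
    "assay-protocol.docx": {"Hexokinase"},
    "inhibitor-study.pdf": {"Carbonic anhydrase"},
    "kinetics-review.pdf": {"Hexokinase", "Carbonic anhydrase"},
}


def documents_for_cofactor(cofactor: str):
    """Return documents tagged with any enzyme that has the given cofactor."""
    enzymes = {
        name for name, cofactors in enzyme_cofactors.items()
        if cofactor in cofactors
    }
    return sorted(
        doc for doc, tags in document_tags.items() if tags & enzymes
    )


print(documents_for_cofactor("Magnesium"))
```

The documents themselves never mention the cofactor; the connection only exists because the tag points to an entity whose properties are described elsewhere.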
SharePoint add-on from DIQA: GRASP
GRASP accesses SPARQL endpoints from the web of data.
[Diagram: Linked Open Data cloud 1) → SPARQL → GRASP in SharePoint 2010 → GRASP visualizations in the web browser]
1) Linking Open Data cloud: Richard Cyganiak, Anja Jentzsch
Read more about GRASP: http://www.diqa-pm.com/en/GRASP
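How a client talks to a SPARQL endpoint can be sketched with the Python standard library. This is a generic illustration of the SPARQL protocol over HTTP, not a description of how GRASP is implemented; the endpoint URL is a placeholder, and the request is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://example.org/sparql"  # placeholder, not a real endpoint


def build_request(endpoint: str, query: str) -> Request:
    """Build a GET request that asks for results in the SPARQL JSON format."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(result_json: str):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    data = json.loads(result_json)
    names = data["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in data["results"]["bindings"]
    ]


# Hand-written sample response in the SPARQL 1.1 JSON results format.
sample = (
    '{"head": {"vars": ["s"]}, "results": {"bindings": '
    '[{"s": {"type": "uri", "value": "http://example.org/a"}}]}}'
)

request = build_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
print(request.full_url)
print(bindings_to_rows(sample))
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out so that the sketch runs without network access.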
DIQA’S SHAREPOINT FINDABILITY SOLUTION
• TERMINOLOGY MANAGEMENT
• AUTOMATIC DOCUMENT CLASSIFICATION
• INTELLIGENT SEARCH
SharePoint Findability Solution: Features
1. Upload and manage terminologies in the “library of ontologies” (e.g. SKOS and TBX/TermBase eXchange).
2. Load terminologies into term stores, groups or term sets.
3. Manage the terms in the terminology manager (e.g. labels in different languages).
4. Manage the relations between terms, including associations and poly-hierarchies.
5. Create classification rules in order to automatically tag the document corpus (requires Layer2 Autotagger).
6. Use the terminology to intelligently suggest search terms in the document search (Term Suggester).
7. Use the TreeView Refiner to drill down or drill up in the search results.
8. The user is guided in the search process by the “Matching Terms” and “Related Terms” webparts.
1. Library of ontologies
Upload terminologies (in SKOS or TBX) and manage them in a library.
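For readers unfamiliar with SKOS, the following Python sketch reads a tiny SKOS terminology in RDF/XML and extracts the multilingual labels and the broader-term relation. The concept URIs and labels are invented sample data, and the parser only handles this simple, flat RDF/XML layout; it does not show how the DIQA product imports terminologies.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Invented two-concept terminology for the illustration.
SKOS_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/terms/enzyme">
    <skos:prefLabel xml:lang="en">Enzyme</skos:prefLabel>
    <skos:prefLabel xml:lang="de">Enzym</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/terms/kinase">
    <skos:prefLabel xml:lang="en">Kinase</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/terms/enzyme"/>
  </skos:Concept>
</rdf:RDF>"""


def load_concepts(xml_text: str) -> dict:
    """Map each concept URI to its labels per language and its parents."""
    root = ET.fromstring(xml_text)
    concepts = {}
    for node in root.findall("skos:Concept", NS):
        labels = {
            label.get(XML_LANG): label.text
            for label in node.findall("skos:prefLabel", NS)
        }
        broader = [
            parent.get(RDF_RESOURCE)
            for parent in node.findall("skos:broader", NS)
        ]
        concepts[node.get(RDF_ABOUT)] = {"labels": labels, "broader": broader}
    return concepts


for uri, concept in load_concepts(SKOS_XML).items():
    print(uri, concept)
```

The `broader` list can hold more than one parent, which is how a poly-hierarchy is expressed in SKOS.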
2. Load terminologies into the term store
  1. Select a terminology or taxonomy to populate a term store…
  2. Select the term store and the update strategy.
3. Manage terms
Manage term labels in different languages, descriptions, …
4. Manage relations between terms
Add terms that are related to this term…
Manage multiple parent terms (poly-hierarchy)…
…pick parent terms from the tree browser.
Inspect the full term hierarchy in the TreeBrowser.
5. Define classification rules
If a document satisfies this rule, it is tagged with a specific term.
Validate the rule before it is used to analyze your entire document corpus.
5. Tag documents automatically
All SharePoint content is tagged automatically based on the classification rules.
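The idea of rule-based tagging can be sketched as follows. The rule format here (a term plus keywords that must all occur in the text) is an assumption made for the illustration; it is not the rule syntax of the Layer2 Autotagger, and the documents are invented.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: tag with `term` if every keyword occurs in the text."""
    term: str
    keywords: tuple

    def matches(self, text: str) -> bool:
        lowered = text.lower()
        return all(keyword.lower() in lowered for keyword in self.keywords)


def tag_corpus(documents: dict, rules: list) -> dict:
    """Apply every rule to every document and collect the matching terms."""
    return {
        name: sorted(rule.term for rule in rules if rule.matches(text))
        for name, text in documents.items()
    }


rules = [
    Rule("Enzyme", ("enzyme",)),
    Rule("Cofactor", ("cofactor", "magnesium")),
]

# Validate a single rule on a sample text before running the whole corpus.
print(rules[1].matches("Magnesium is a cofactor"))

documents = {
    "a.txt": "This enzyme requires magnesium as a cofactor.",
    "b.txt": "Quarterly budget report.",
}
print(tag_corpus(documents, rules))
```

Checking a rule against a handful of known documents first keeps a faulty rule from mis-tagging the entire corpus.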
6. Search terms are intelligently suggested
The Term Suggester Webpart supports users while they type a search query…
…the intelligent matching algorithm suggests terms from the terminology that contain parts of the search query in their labels and synonyms.
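A much simplified version of such a suggester is shown below. It assumes that "contains parts of the search query" means a case-insensitive substring match against labels and synonyms; the product's actual matching algorithm is not described in the slides, and the terminology is sample data.

```python
# Sample terminology: preferred label mapped to its synonyms.
terminology = {
    "Enzyme": ["biocatalyst"],
    "Kinase": ["phosphotransferase"],
    "Cofactor": ["coenzyme"],
}


def suggest_terms(query: str, terms: dict) -> list:
    """Suggest every term whose label or synonym contains the typed text."""
    typed = query.strip().lower()
    if not typed:
        return []
    hits = []
    for label, synonyms in terms.items():
        names = [label, *synonyms]
        if any(typed in name.lower() for name in names):
            hits.append(label)
    return sorted(hits)


print(suggest_terms("enzy", terminology))
print(suggest_terms("phospho", terminology))
```

Matching on synonyms is what lets users find a term whose preferred label they do not know: typing "enzy" also surfaces "Cofactor" through its synonym "coenzyme".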
7. Term-tree to navigate in search results
The TreeView Refiner Webpart extends the standard refiner webpart and visualises the terms in the context of the term-tree.
Users can select terms in the term-tree to drill down or drill up in the search results.
Search results are updated as you navigate in the term tree.
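Drill-down and drill-up can be illustrated with a small term tree. The sketch assumes that selecting a term keeps every document tagged with that term or any term beneath it; whether the TreeView Refiner behaves exactly this way is not stated in the slides. The tree is single-parent for brevity, and all terms and documents are sample data.

```python
# Sample term tree: child mapped to parent (None marks the root).
parent_of = {
    "Enzyme": None,
    "Kinase": "Enzyme",
    "Hexokinase": "Kinase",
    "Protease": "Enzyme",
}

# Sample tags on search results.
document_tags = {
    "doc1": {"Hexokinase"},
    "doc2": {"Protease"},
    "doc3": {"Enzyme"},
}


def descendants(term: str) -> set:
    """Return the term together with all terms below it in the tree."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in parent_of.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def refine(term: str) -> list:
    """Keep only documents tagged with the term or one of its descendants."""
    scope = descendants(term)
    return sorted(doc for doc, tags in document_tags.items() if tags & scope)


def drill_up(term: str):
    """Move one level up in the tree to broaden the search."""
    return parent_of.get(term)


print(refine("Kinase"))            # narrow: only the kinase branch
print(refine(drill_up("Kinase")))  # broaden: the whole enzyme tree
```

Drilling down narrows the result list to one branch, while drilling up to the parent term brings back documents from sibling branches.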
8. Matching terms guide the user in the search process
Pick a new search term from the list of matching terms and resume the search.
Advantages over standard SharePoint Search
1. Superior managed metadata for content classification
2. Integrated taxonomies from various sources
3. Reliable automatic document tagging
4. Users find documents immediately, even without knowing the taxonomy
5. Users are guided in the search process
6. The terms contained in the search results are presented in their taxonomic context
7. Users can easily drill up or drill down in the tree to broaden or narrow the search
Get Started Now!
http://diqa-pm.com/en/Stop_searching_start_finding
Contact DIQA:
mail: [email protected]
phone: +49 (0) 721 609 517 25
Take Home Message
• Semantic Web
  • Open standards for publishing structured data (graph knowledge)
  • Vast number of available data sources
  • DIQA makes this knowledge accessible in SharePoint
• Metadata is one key benefit of SharePoint
Stop searching, start finding: the "SharePoint Findability" solution from DIQA provides reliable products and a proven method to find documents more quickly and efficiently.
DIQA Projektmanagement GmbH, Pfinztalstraße 90, 76227 Karlsruhe
Visit us at http://www.diqa-pm.com
Thank you for your attention!