ailab.ijs.si
Machine Learning and Knowledge Discovery for Semantic Web
Dunja Mladenić
Artificial Intelligence Laboratory,
J. Stefan Institute,
Slovenia
Jožef Stefan Institute, Artificial Intelligence Laboratory
Selection of FP6 & FP7 Projects (Integrated Projects and Networks of Excellence only):
FP7 IP ACTIVE – Enabling the Knowledge Powered Enterprise
FP7 IP COIN – COllaboration and INteroperability for networked enterprises
FP7 IP EURIDICE – Inter-Disciplinary Research on Intelligent Cargo for Efficient, Safe and Environment-friendly Logistics
FP7 NoE PASCAL2 – Pattern Analysis, Statistical Modeling and Computational Learning
FP7 NoE MetaNet – Machine Translation & Multilingual Information Retrieval
FP7 NoE Multilingual Web
FP6 IP NeOn – Lifecycle Support for Networked Ontologies
FP6 IP ECOLEAD – European Collaborative Networked Organizations Leadership Initiative
FP6 IP SEKT – Semantically-Enabled Knowledge Technologies
Jožef Stefan Institute (JSI) is the leading Slovene research institution for natural sciences (900+ people) in the areas of computer science, physics, chemistry
Artificial Intelligence Laboratory has over 30 people working in various areas of artificial intelligence (machine learning, data mining, semantic technologies, computational linguistics, logic)
Spin-offs: Quintelligence, Cyc-Europe, LiveNetLife, ModroOko, Envigence
Selection of Portals and Products:
Text-Garden (http://www.textmining.net)
Enrycher (http://enrycher.ijs.si/)
VideoLectures.NET (http://videolectures.net/)
IST-World (http://www.ist-world.org/)
Project Intelligence (http://pi.ijs.si/)
Search-Point (http://searchpoint.ijs.si/)
OntoGen (http://ontogen.ijs.si/)
Document-Atlas (http://docatlas.ijs.si/)
AnswerArt (http://answerart.net/)
Contextify (http://contextify.net/)
Business Clients: Accenture Labs, Bloomberg, British Telecom, Google Labs, Microsoft Research, New York Times, Siemens, Wikipedia
Academic Partners: Carnegie Mellon, Cornell, Stanford, MIT, Uni. Maryland, KIT, UCL
[Screenshots: Document-Atlas, VideoLectures.NET, Enrycher, IST-World, SearchPoint, OntoGen, AnswerArt, Contextify e-mails]
AILab Technologies
Graph/Social Network Analysis
(GraphGarden/SNAP, IST-World,
FPIntelligence)
Complex Data Visualization
(DocAtlas, NewsExplorer, SearchPoint)
Computational Linguistics
(Enrycher, AnswerArt)
Social Computing/Web2.0 (LiveNetLife)
Light-Weight Semantic Technologies
(OntoGen, Contextify)
Deep Semantics & Reasoning (Cyc)
Statistical Machine Learning
Data/Web/Text/Stream-Mining
(TextGarden Suite of tools)
Outline
Motivation
Machine Learning and Ontologies
OntoGen
OntoPlus
Semantics for search and browsing
SearchPoint
AnswerArt
Enrycher
Sensor Search
Real-time data processing
NYTMiner, BBMiner, Personalized News Search
…to conclude
Motivation
Semantic Web
integrates many existing ideas and technologies, focusing on upgrading the existing nature of web-based information systems to a more “semantic” oriented nature
typical approach is top-down modeling of knowledge and proceeding down towards the data
Machine Learning and Knowledge Discovery in Databases
aims at data modeling and extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information from large datasets
data-driven bottom-up approach trying to discover the structure in the data and express it in more abstract ways and rich knowledge formalisms
ML & KDD role within Semantic Web
Ontology construction
SW applications involve deep structured knowledge composed into ontologies
ML/KDD discovering structure in the data - structuring knowledge
semi-automatically extract knowledge from data into ontological structure
Integrating domain knowledge
ML/KDD approaches, e.g., “Active Learning” and “Semi-supervised Learning”, make use of small pieces of human knowledge for better guidance towards the desired model (e.g., ontology)
reduce human effort by an order of magnitude while preserving the quality of results
Handling data over time - dynamic ontologies
data and the corresponding semantic structures change in time
KDD technologies for stream mining - deal with the stream of incoming data fast enough to be up-to-date with the corresponding models (ontologies)
Supporting different data modalities
ML/KDD technologies are not limited to a specific data representation - handling different data modalities (databases, text, multimedia, graphs)
ML/KDD for Language Technologies
SW mainly deals with textual data, LT are thus important for SW including lexical, syntactical and semantic levels of natural language processing
ML/KDD for modeling natural language by automatic learning from rare/costly data
Scalability
KDD approaches consider scalability
SW is ultimately concerned with real-life data on the web which have exponential growth
Ontology - SW commonly uses ontologies to structure knowledge
Ontology can be seen as a graph/network structure consisting of:
a set of concepts (vertices in a graph),
a set of relationships connecting concepts (directed edges in a graph),
a set of instances assigned to particular concepts (data records assigned to vertices in a graph)
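The graph view of an ontology described above can be sketched as a small data structure. This is only an illustration: the class name `Ontology`, its methods, and the relation label `subConceptOf` are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Ontology as a directed graph (hypothetical illustration)."""
    concepts: set = field(default_factory=set)     # vertices
    relations: set = field(default_factory=set)    # directed edges: (source, label, target)
    instances: dict = field(default_factory=dict)  # concept -> list of data records

    def add_concept(self, name):
        self.concepts.add(name)
        self.instances.setdefault(name, [])

    def add_relation(self, source, label, target):
        # A relation may only connect concepts that already exist in the graph.
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both endpoints must be known concepts")
        self.relations.add((source, label, target))

    def add_instance(self, concept, record):
        # An instance is a data record assigned to one concept (vertex).
        if concept not in self.concepts:
            raise KeyError("unknown concept: " + concept)
        self.instances[concept].append(record)

    def sub_concepts(self, concept, label="subConceptOf"):
        return sorted(s for (s, l, t) in self.relations if l == label and t == concept)


onto = Ontology()
for name in ("News", "Business", "Society"):
    onto.add_concept(name)
onto.add_relation("Business", "subConceptOf", "News")
onto.add_relation("Society", "subConceptOf", "News")
onto.add_instance("Business", "UK takeovers and mergers")
print(onto.sub_concepts("News"))
```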
Ontology construction
One of the methodologies defined for ontology construction is a methodology for semi-automatic ontology construction. Analogous to the CRISP-DM methodology, it can be defined as consisting of the following interrelated phases:
1. domain understanding (what is the area we are dealing with?),
2. data understanding (what is the available data and its relation to semi-automatic ontology construction?),
3. task definition (based on the available data and its properties, define task(s) to be addressed),
4. ontology learning (semi-automated process addressing the task(s)),
5. ontology evaluation (estimate quality of the solutions to the addressed task(s)),
6. refinement with human in the loop (perform any transformation needed to improve the ontology and return to any of the previous steps, as desired)
[Grobelnik & Mladenić 2006]
ML/KDD for ontology learning
Define the ontology learning tasks in terms of mappings between ontology components, where some of the components are given and some are missing and we want to induce the missing ones.
Some typical scenarios in ontology learning are the following:
Inducing concepts/clustering of instances (given instances)
Inducing relations (given concepts and the associated instances)
Ontology population (given an ontology and relevant, but not associated instances)
Ontology generation (given instances and any other background information)
Ontology updating/extending (given an ontology and background information, such as, new instances or the ontology usage patterns)
Ontology Population via document classification into topic ontology
Goal: given a collection of documents organized into a topic ontology, classify a new document into the ontology
Different classification algorithms were applied on different data representations (e.g., word-vectors, word n-gram vectors, flexible phrase vectors) and on different datasets (e.g., Yahoo! directory of Web pages, US patent database, directory of Slovenian/Croatian Web pages, News directory)
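A minimal sketch of classifying a new document into a topic ontology, assuming a plain word-vector representation and a nearest-centroid classifier with cosine similarity. The slides do not say which algorithms were used, so this is just one simple choice; the function names and the toy training data are hypothetical.

```python
import math
from collections import Counter


def word_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace tokenised)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train_centroids(labelled_docs):
    """Sum the word vectors of all documents filed under each topic."""
    centroids = {}
    for topic, text in labelled_docs:
        centroids.setdefault(topic, Counter()).update(word_vector(text))
    return centroids


def classify(text, centroids):
    """Return the topic whose centroid is most similar to the new document."""
    vec = word_vector(text)
    return max(centroids, key=lambda topic: cosine(vec, centroids[topic]))


# Toy topic ontology with two topics (made-up documents).
training = [
    ("Business", "stocks bonds investing market"),
    ("Business", "takeover merger shares market"),
    ("Society", "government election regional policy"),
    ("Society", "court law government society"),
]
centroids = train_centroids(training)
predicted = classify("merger boosts shares and stocks", centroids)
print(predicted)
```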
OntoClassify
System for scalable classification of text into large topic ontologies [Grobelnik & Mladenić, 2005]
Available as Web service
for DMoz directory of Web pages
for Inspec ontology for annotating papers
for MeSH medical ontology
Constructing ontology from data stream
Goal: given a stream of documents (e.g., news arriving over time), construct an ontology
Solution: framework that incorporates the stream mining process into a formal definition of ontology [Grobelnik et al., 2006]
Extract named entities and use them as instances of the ontology
Entities and co-occurring entity pairs are represented by feature vectors based on the content of the documents they occur in
Concepts and relations can be formed either by clustering or by classification into an existing topic hierarchy
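The entity and entity-pair representation described above can be sketched as follows. This assumes that named entities have already been extracted for each incoming document (the extraction step itself is not shown) and that feature vectors are plain word counts; the function and variable names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

entity_vectors = defaultdict(Counter)  # entity -> word counts of documents mentioning it
pair_vectors = defaultdict(Counter)    # (entity, entity) -> word counts of documents mentioning both


def process_document(text, entities):
    """Update the feature vectors with one document from the stream."""
    words = Counter(text.lower().split())
    unique = sorted(set(entities))
    for entity in unique:
        entity_vectors[entity].update(words)
    # Each co-occurring pair is keyed in sorted order so (A, B) and (B, A) coincide.
    for pair in combinations(unique, 2):
        pair_vectors[pair].update(words)


# Documents arrive one at a time; the entity lists are assumed inputs.
process_document("government talks on regional policy", ["France", "UK"])
process_document("stocks and bonds rally", ["UK", "France"])
process_document("stocks fall", ["UK"])
print(pair_vectors[("France", "UK")].most_common(3))
```

The resulting vectors could then be clustered, or classified into an existing topic hierarchy, to form concepts and relations.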
Illustrative results on Reuters news
Observe change in relations between entities over time, e.g.,
France – UK relation focused first on Society (Society, Government, Regional,...) and later moves to Business (Investing, Business, Stocks, Bonds,…)
Ontology Learning from text
Extending the existing ontology
commonly used is the English lexical ontology WordNet, which is extended using some text, e.g., Web documents [Agirre et al., 2000]
Learning relations for an existing ontology (from docs)
learn relations between the concepts (e.g., “isa” [Cimiano et al., 2004], “hasPart” [Maedche, Staab, 2001]), extract semantic relations from text based on collocations [Heyer et al., 2001]
Ontology construction based on clustering (from docs)
split each document into sentences, parse the text and apply clustering for semi-automatic construction of an ontology [Bisson et al., 2000; Reinberger et al., 2004]
cluster sentences and map them onto the concepts of a general ontology (e.g., WordNet [Hotho et al., 2003])
use whole documents and guide the user through a semi-automatic process of ontology construction [Fortuna et al., 2005]
Ontology Learning from text (cont)
Ontology construction based on semantic graphs
parse the documents and construct semantic graphs, use them for learning document summaries [Leskovec et al., 2004]
Ontology construction from a collection of news stories
represent news as graphs of named entities with relationships based on collocations, used for visualization/browsing [Grobelnik, Mladenić, 2004]
More information in the edited book [Buitelaar et al., 2005]
SEMI-AUTOMATIC DATA-DRIVEN ONTOLOGY CONSTRUCTION
Blaz Fortuna, Dunja Mladenić, Marko Grobelnik
http://ontogen.ijs.si
Ontology Learning with OntoGen
Semi-Automatic
provide suggestions and insights into the domain
the user interacts with parameters of methods
final decisions taken by the user
Data-Driven
most of the aid provided by the system is based on some underlying data
instances are described by features extracted from the data (e.g., word vectors)
Installation package available at ontogen.ijs.si
Main Features
Interactive user interface
User can interact in real-time with the integrated machine learning and text mining methods
Concept discovery methods:
Unsupervised
k-means clustering
Latent Semantic Indexing (LSI)
Supervised
Active learning
Concept visualization
Methods for helping to understand the discovered concepts:
Keyword extraction
TFIDF and SVM-normal based keyword extraction
Concept visualization
LSI and multi-dimensional scaling based visualization
Also available as a separate tool named Document Atlas: http://docatlas.ijs.si
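As an illustration of the unsupervised concept discovery listed above, here is a minimal k-means sketch. It is not OntoGen's implementation: OntoGen works on document feature vectors, whereas this toy example clusters 2D points, and the deterministic initialisation (first k points) is an assumption made so that the example is reproducible.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; returns (cluster index per point, centroids)."""
    centroids = [list(p) for p in points[:k]]  # assumed init: the first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

Each resulting cluster would be offered to the user as a suggested sub-concept, with the final decision left to the user.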
Ontology management
Concept hierarchy
List of suggested sub-concepts
Ontology visualization
Selected concept
Concept management
Concept’s details
Concept’s instance management
Selected concept
Keywords
Selected instance
Active Learning for concept learning
SVM hyperplane distance based active learning algorithm
First few labelled documents are bootstrapped from a query search
Instances for final concept are selected using the final SVM model
Query
New Concept
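A minimal sketch of the selection step in hyperplane-distance-based active learning. It assumes the common variant in which the user is asked to label the unlabelled instance closest to the current SVM hyperplane. The weight vector `w` and bias `b` stand in for a linear SVM trained on the bootstrapped labels; their values here are made up for illustration.

```python
import math


def distance_to_hyperplane(x, w, b):
    """Geometric distance of point x from the hyperplane w.x + b = 0."""
    margin = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(margin) / math.sqrt(sum(wi * wi for wi in w))


def next_query(unlabelled, w, b):
    """Index of the instance the model is least certain about."""
    return min(
        range(len(unlabelled)),
        key=lambda i: distance_to_hyperplane(unlabelled[i], w, b),
    )


# Hypothetical current model and pool of unlabelled instances.
w, b = (1.0, -1.0), 0.0
pool = [(3.0, 0.0), (1.0, 0.9), (0.0, 4.0)]
query_index = next_query(pool, w, b)
print(query_index)
```

After the user labels the selected instance, the model would be retrained and the selection repeated; the final model then decides which instances belong to the new concept.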
Multiple views of the same data
Reuters news articles are used in this example with two different sets of categories: topics, or the list of countries that appear in the news articles.
Each set of categories offers a different view on the data.
SVM based method detects importance of keywords for each view.
Views shown: Topics view, Countries view
Example articles:
UK takeovers and mergers: The following are additions and deletions to the takeovers and mergers list for the week beginning August 19, as provided by the Takeover …
Lloyd’s CEO questioned in recovery suit in U.S.: Ronald Sandler, chief executive of Lloyd's of London, on Tuesday underwent a second day of court interrogation about …
Concept’s instances visualization
Instances are visualized as points on a 2D map. The distance between two instances on the map corresponds to their similarity.
Characteristic keywords are shown for all parts of the map.
User can select groups of instances on the map to create sub-concepts.
Ontology population
[Screenshot labels: New documents; Selected document; Classification of selected document]
System uses one-vs-all linear SVM trained on the created ontology to classify new instances into concepts.
Users can finalize the classifications using an interactive user interface.
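The one-vs-all decision rule can be sketched as below. The per-concept weight vectors and biases are made-up placeholders for linear models that would be trained on the ontology's instances; training itself is not shown, and the function names are hypothetical.

```python
def linear_score(x, w, b):
    """Decision value w.x + b of one linear model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify_one_vs_all(x, models):
    """Score x against every concept's model and pick the highest score."""
    scores = {concept: linear_score(x, w, b) for concept, (w, b) in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical pre-trained models: concept -> (weights, bias).
models = {
    "Business": ((1.0, -0.5), 0.0),
    "Society": ((-0.5, 1.0), 0.0),
    "Other": ((-0.2, -0.2), 0.1),
}
best, scores = classify_one_vs_all((0.9, 0.1), models)
print(best, scores)
```

Returning all scores, not only the winner, fits the interactive setting: the user can inspect the ranked suggestions before finalizing a classification.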
ONTOGEN ON IMAGES
Nenad Tomašev, Blaz Fortuna, Dunja Mladenić, Marko Grobelnik
Image representation
[Diagram labels: SIFT features, Color info, Text; Extract features; Data Mining Application]
Image representation - features
SIFT features
Rotation, scale and translation invariant orientation gradients located at “interesting” points on an image
Usually, SIFT feature space is quantized to get “representative” vectors (“codebook” histogram)
Color histogram
Simply divide the color spectrum into “buckets” and calculate the distribution of colors into these buckets (color histogram)
Distance - weighted sum of SIFT codebook and color data distances
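A minimal sketch of the color histogram and the weighted distance described above. Several details are assumptions not stated on the slide: RGB pixels in the range 0..255, equal-width buckets per channel, Euclidean distance for both parts, and an equal default weight. SIFT codebook histograms are taken as precomputed inputs.

```python
import math


def color_histogram(pixels, buckets=4):
    """Normalised histogram over buckets**3 RGB cells."""
    hist = [0.0] * buckets ** 3
    step = 256 / buckets
    for r, g, b in pixels:
        # Map each channel to its bucket and combine into one cell index.
        index = (int(r // step) * buckets + int(g // step)) * buckets + int(b // step)
        hist[index] += 1
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def image_distance(sift_a, sift_b, color_a, color_b, sift_weight=0.5):
    """Weighted sum of the SIFT codebook distance and the color distance."""
    return (sift_weight * math.dist(sift_a, sift_b)
            + (1 - sift_weight) * math.dist(color_a, color_b))


red = color_histogram([(255, 0, 0)] * 4)
blue = color_histogram([(0, 0, 255)] * 4)
same_sift = [0.5, 0.5]  # hypothetical codebook histogram shared by both images
d = image_distance(same_sift, same_sift, red, blue)
print(d)
```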
OntoGen on ImageNet subset (flowers, fire, buildings)
Document list for quick overview
Collection visualization (without displaying images)
Collection visualization (displaying images)
Creating ontology on images
Grouping similar images - concepts
Displaying relevant features as concept names
Sub-concept visualization
flower
buildings
fire
Adding sub-concepts
TEXT-DRIVEN ONTOLOGY EXTENSION
Inna Novalija, Dunja Mladenić
OntoPlus
Architecture
OntoPlus methodology allows for the effective extension of very large ontologies.
OntoPlus methodology provides the user with required concepts and relationships in the form of a ranked list.
OntoPlus methodology combines textual ontology content, ontology structure and co-occurrence information.
[Architecture diagram]
Domain Information Module (DIM): Domain Keywords; Domain Glossary (Term Names; Term Descriptions); Domain KB
Domain Subset Extraction Module (DSEM): Upper-Level Domain Extractor (extraction of relevant domains); Domain Knowledge Extractor (extraction of ontology concepts defined in relevant domains; extraction of ontology concepts with denotation similar to Glossary Term names); Relevant Ontology Subset
Ontology Extension Module (OEM): Ontology Extender; Candidate Entries (for each Glossary Term, a ranked list of related ontology concepts and suggested correspondent relations); Validated Entries (Glossary Term, Ontology Concept, Relation)
Multi-Domain Ontology
Process steps: domain information identification; extraction of the domain relevant ontology subset; related concepts extraction; user validation; ontology reuse
OntoPlus
Text-Driven Ontology Extension Using Ontology Content, Structure and Co-occurrence
Ranking existing ontology concepts as corresponding to a new domain concept suggested for the ontology extension
Experiments using Cyc ontology and textual material from two domains – Finances, and Fisheries & Aquaculture
Best results by combining content, structure and co-occurrence information
Financial domain - ontology content and structure
Fisheries & Aquaculture domain - ontology content and co-occurrence
Results – Concept Ranking

Evaluation of the top suggested candidate concepts for ontology extension (Financial glossary), 100 random terms

| Weighting measure | HR (Top 1): Eqv or Hier Rels | HR (Top 1): Any Rels | HR (Top 5): Eqv or Hier Rels | HR (Top 5): Any Rels | HR (Top 10): Eqv or Hier Rels | HR (Top 10): Any Rels |
|---|---|---|---|---|---|---|
| Baseline - Name: [1.0] | 18 | 28 | 24 | 36 | 25 | 40 |
| Content (cos. similarity): [1.0] | 32 | 65 | 60 | 92 | 68 | 95 |
| Co-occur (Jaccard similarity): [1.0] | 30 | 48 | 48 | 62 | 52 | 73 |
| Content: [0.5], Structure: [0.4], Co-occur: [0.1] | 38 | 68 | 66 | 95 | 76 | 98 |

Evaluation of the top suggested candidate concepts for ontology extension (ASFA thesaurus), 100 random terms

| Weighting measure | HR (Top 1): Eqv or Hier Rels | HR (Top 1): Any Rels | HR (Top 5): Eqv or Hier Rels | HR (Top 5): Any Rels | HR (Top 10): Eqv or Hier Rels | HR (Top 10): Any Rels |
|---|---|---|---|---|---|---|
| Baseline - Name: [1.0] | 24 | 37 | 25 | 38 | 27 | 40 |
| Content (cos. similarity): [1.0] | 32 | 72 | 52 | 88 | 56 | 91 |
| Co-occur (Jaccard similarity): [1.0] | 33 | 71 | 49 | 89 | 51 | 90 |
| Content: [0.5], Structure: [0.0], Co-occur: [0.5] | 42 | 84 | 63 | 96 | 66 | 96 |

Baseline - Name: string edit distance of concept name
Combined rows: Content + Structure + Co-occurrence (Financial glossary); Content + Co-occurrence (ASFA thesaurus)
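A minimal sketch of how a weighted combination of content, structure and co-occurrence scores could rank candidate concepts for a glossary term. Only the measures named in the results (cosine similarity for content, Jaccard similarity for co-occurrence) and the weights come from the slides. How the structure score is computed is not given, so it is taken here as a precomputed input, and all names and data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two sparse word-weight dictionaries."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity of two sets of document ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(term_words, term_docs, candidates, weights=(0.5, 0.4, 0.1)):
    """Rank ontology concepts by a weighted sum of the three scores."""
    w_content, w_structure, w_cooccur = weights
    scored = []
    for name, c in candidates.items():
        score = (w_content * cosine(term_words, c["words"])
                 + w_structure * c["structure"]  # assumed precomputed, in [0, 1]
                 + w_cooccur * jaccard(term_docs, c["docs"]))
        scored.append((score, name))
    return [name for score, name in sorted(scored, reverse=True)]


candidates = {
    "Aquaculture": {"words": {"fish": 1, "farm": 1, "water": 1}, "structure": 0.5, "docs": {2, 3, 4}},
    "Bank": {"words": {"money": 1}, "structure": 0.0, "docs": {9}},
}
ranked = rank_candidates({"fish": 1, "farm": 1}, {1, 2, 3}, candidates)
print(ranked)
```

Passing `weights=(0.5, 0.0, 0.5)` would reproduce the content plus co-occurrence weighting from the second table.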
Demo
CONTEXT SENSITIVE SEARCH
Boštjan Pajntar, Marko Grobelnik, Dunja Mladenić
http://SearchPoint.ijs.si
SearchPoint
Search engines generally work very well
There are cases where it is difficult to specify a query
Idea: help the user by clustering all the hits and visualising the results space
Some related work:
mindset.research.yahoo.com – research vs. shopping aspect
www.ujiko.com – clustering & user interface
vivisimo.com – hierarchical clustering
Approach Description
Search results clustered and shown in 2D space
Each point in this cluster space corresponds to a ranking
Hits are ordered according to the position of the focus - the selected point
Initial focus position corresponds to Google ranking
Positioning clusters with respect to centroid-to-centroid similarity
Calculating ranking of document using its similarity to each centroid:
Classifying documents into web directory (DMoz), visualising relevant parts of the directory
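The ranking formula from the slide is not preserved in this transcript. As a hedged illustration of the general idea (re-ordering hits according to where the focus sits among the clusters), the sketch below weights each document's similarity to a cluster centroid by how close the focus is to that cluster on the 2D map. The inverse-distance weighting, the function name and the toy data are assumptions, not SearchPoint's actual formula.

```python
import math


def rerank(doc_similarities, cluster_positions, focus):
    """Order documents by focus-weighted similarity to the cluster centroids."""
    # Clusters near the focus get a weight close to 1, distant ones close to 0.
    weights = {
        cluster: 1.0 / (1.0 + math.dist(focus, position))
        for cluster, position in cluster_positions.items()
    }
    scores = {
        doc: sum(weights[cluster] * sim for cluster, sim in sims.items())
        for doc, sims in doc_similarities.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy example for the query "jaguar": two clusters placed on the 2D map.
cluster_positions = {"car": (0.0, 0.0), "animal": (10.0, 0.0)}
doc_similarities = {
    "jaguar-xf-review": {"car": 0.9, "animal": 0.1},
    "big-cats-of-america": {"car": 0.1, "animal": 0.9},
}
near_car = rerank(doc_similarities, cluster_positions, focus=(0.0, 0.0))
near_animal = rerank(doc_similarities, cluster_positions, focus=(10.0, 0.0))
print(near_car, near_animal)
```

Moving the focus between clusters changes the ranking continuously, so the user is not forced to commit to a single cluster.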
Search
“Internet search” – one of the most common tasks involving text manipulation in everyday life
…but – how smart is search technology today?
…not too smart!
It is sophisticated, but not smart
Example – Searching for “jaguar”
Query “jaguar” has many meanings…
…but the first page of search engines doesn’t provide us with many answers
…there are 84M more results
Context sensitive search
[Diagram labels: Query; Conceptual map; Search Point; Dynamic contextual ranking based on the search point]
![Page 46: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/46.jpg)
SearchPoint
![Page 47: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/47.jpg)
SearchPoint
![Page 48: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/48.jpg)
Main advantages
Generated clusters (in contrast to predefined)
User can search the whole cluster space and is not forced to select a single cluster
(Computer generated clusters are not necessarily what the user has in mind)
![Page 49: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/49.jpg)
SearchPoint integrated in Accenture’s intranet search
![Page 50: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/50.jpg)
ANSWER ART
Luka Bradeško, Lorand Dali, Blaž Fortuna, Marko Grobelnik, Dunja
Mladenić, Inna Novalija, Boštjan Pajntar
http://AnswerArt.net
![Page 51: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/51.jpg)
AnswerArt – System Architecture
Triplets
Extended ontology
AnswerArt preprocessing
Domain ontology (ASFA, WordNet)
Semantic enhancement of triplets
AnswerArt
Index
Extraction
Cyc
Question Answer
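A minimal sketch of how extracted subject – predicate – object triplets could be indexed and matched against a question pattern. This is not AnswerArt's implementation: the class `TripletIndex`, its methods and the toy facts are hypothetical, and the semantic enhancement via the domain ontology and Cyc is left out.

```python
from collections import defaultdict

class TripletIndex:
    """Toy index over (subject, predicate, object) triplets."""

    def __init__(self):
        self.triplets = []
        self.by_field = defaultdict(set)  # (field, value) -> triplet ids

    def add(self, subject, predicate, obj):
        tid = len(self.triplets)
        self.triplets.append((subject, predicate, obj))
        for field, value in zip("spo", (subject, predicate, obj)):
            self.by_field[(field, value.lower())].add(tid)

    def query(self, subject=None, predicate=None, obj=None):
        """Return triplets matching all given fields; None is a wildcard."""
        ids = set(range(len(self.triplets)))
        for field, value in zip("spo", (subject, predicate, obj)):
            if value is not None:
                ids &= self.by_field.get((field, value.lower()), set())
        return [self.triplets[i] for i in sorted(ids)]

index = TripletIndex()
index.add("aspirin", "reduces", "fever")      # made-up example facts
index.add("aspirin", "treats", "headache")
index.add("ibuprofen", "reduces", "fever")

# "What reduces fever?" becomes the pattern (?, reduces, fever).
answers = index.query(predicate="reduces", obj="fever")
```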
![Page 52: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/52.jpg)
AnswerArt using Medline
![Page 53: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/53.jpg)
AnswerArt using Medline
Show document
![Page 54: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/54.jpg)
Show document overview
![Page 55: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/55.jpg)
![Page 56: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/56.jpg)
AnswerArt using ASFA
![Page 57: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/57.jpg)
AnswerArt using ASFA
Show document
![Page 58: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/58.jpg)
AnswerArt using ASFA
Show document overview
![Page 59: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/59.jpg)
NATURAL LANGUAGE TEXT ENRICHMENT
Tadej Štajner, Delia Rusu, Lorand Dali, Blaž Fortuna,
Dunja Mladenić, Marko Grobelnik
http://enrycher.ijs.si
![Page 60: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/60.jpg)
Enrycher Service
Annotation Features:
Entity extraction
People, locations, organizations, dates, percentages and money amounts
Entity resolution
co-reference
anaphora
Entity linkage to Linked Open Data (LOD)
Word Sense Disambiguation to LOD (WordNet 3.0 VUA)
Assertion extraction
Subject – predicate – object sentence elements together with their modifiers
Categories – from the Open Directory and the Wikipedia category schema
![Page 61: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/61.jpg)
Entity resolution in text
![Page 62: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/62.jpg)
Enrycher Service Dependencies
Dashed lines mark optional dependencies between components, whereas solid lines mark required dependencies
![Page 63: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/63.jpg)
A comparative view on five systems: Enrycher, Text Runner, Open Calais, GATE and Read the Web (NELL)
Features Enrycher Text Runner Open Calais GATE NELL
Named Entity Extraction
Co-reference and Anaphora Resolution
Entity resolution
Disambiguation
Assertion Extraction: Relationship extraction, Events and Facts, Relationship extraction
Categories
Visualization
RDF Output
Multi-Language Support: English; English, French, Spanish
Web Service API
Can work on a single document
![Page 64: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/64.jpg)
Enrycher - demo
![Page 65: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/65.jpg)
Enrycher - demo
![Page 66: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/66.jpg)
Enrycher - demo
Entities
Semantic graph
![Page 67: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/67.jpg)
Enrycher - demo
Entity details
In OpenCyc
Category
![Page 68: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/68.jpg)
OPINION MINING
Andreea Bizău, Delia Rusu, Dunja Mladenić
![Page 69: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/69.jpg)
Opinion Mining
Use case: Twitter comments on movies
amazing, awesome
Weird, odd
Weird, odd, bad
amazing, awesome, perfect, fantastic
IMDb Movie reviews* (sample)
IMDb Movie reviews* (Training data)
Domain-specific opinion vocabulary
2 Clusters Vocabulary
applied to Twitter comments analysis
Movie tweets (Test data)
* http://www.cs.cornell.edu/people/pabo/movie-review-data/
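A minimal sketch of applying a two-cluster opinion vocabulary to tweets. The word groups are the ones shown on the slide; treating them as a positive and a negative cluster, the simple word-count scoring, the function names and the example tweets are assumptions, not the authors' method.

```python
import re
from collections import Counter

# Word groups from the slide; the polarity labels are an assumption.
POSITIVE = {"amazing", "awesome", "perfect", "fantastic"}
NEGATIVE = {"weird", "odd", "bad"}

def sentiment_counts(tweet):
    """Count vocabulary words of each cluster that occur in the tweet."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", tweet.lower()):
        if word in POSITIVE:
            counts["positive"] += 1
        elif word in NEGATIVE:
            counts["negative"] += 1
    return counts

def orientation(tweet):
    """Label a tweet by the cluster with more matching words."""
    counts = sentiment_counts(tweet)
    diff = counts["positive"] - counts["negative"]
    if diff > 0:
        return "positive"
    if diff < 0:
        return "negative"
    return "neutral"

# Hypothetical movie tweets.
tweets = [
    "That movie was amazing, just perfect!",
    "Weird plot and bad acting",
    "Watching a movie tonight",
]
labels = [orientation(t) for t in tweets]
```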
![Page 70: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/70.jpg)
Twitter comments analysis
• Sentiment words distribution for a movie
• Sentiment orientation evolution per week, day, hour
• Movie comparison
![Page 71: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/71.jpg)
SENSOR SEARCH
Lorand Dali, Alexandra Moraru, Dunja Mladenić
![Page 72: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/72.jpg)
Sensor Search - Architecture
Sensor Descriptions (Text)
Inverted Index
Ranking Model (Personalized PageRank)
Geo Filtering
SEARCH ENGINE
Query
• keywords
• center of area of interest
• radius of area of interest
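A minimal sketch of the query path above: an inverted index over sensor descriptions combined with geo filtering by center and radius. The class `SensorIndex`, the haversine distance and the toy sensors are assumptions; the Personalized PageRank ranking model is not reproduced, so results are simply sorted by sensor id.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

class SensorIndex:
    def __init__(self):
        self.location = {}                # sensor id -> (lat, lon)
        self.postings = defaultdict(set)  # word -> sensor ids

    def add(self, sensor_id, description, lat, lon):
        self.location[sensor_id] = (lat, lon)
        for word in description.lower().split():
            self.postings[word].add(sensor_id)

    def search(self, keywords, center, radius_km):
        """Sensors whose description has all keywords, within the radius."""
        hits = None
        for word in keywords.lower().split():
            ids = self.postings.get(word, set())
            hits = ids if hits is None else hits & ids
        hits = hits or set()
        return sorted(
            s for s in hits
            if haversine_km(self.location[s], center) <= radius_km
        )

index = SensorIndex()
index.add("s1", "air temperature sensor", 46.05, 14.51)    # toy sensors
index.add("s2", "water temperature sensor", 46.55, 15.65)
index.add("s3", "humidity sensor", 46.06, 14.50)

nearby = index.search("temperature sensor", center=(46.05, 14.50), radius_km=20)
```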
![Page 73: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/73.jpg)
![Page 74: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/74.jpg)
REAL-TIME INFORMATION PROCESSING
Blaž Fortuna, Dunja Mladenić, Marko Grobelnik
![Page 75: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/75.jpg)
QMiner – generic software platform for Real-Time information processing & Complex Event Detection & Anomaly Detection
Generic platform running on clouds for intensive data stream analytics
…processes thousands of events per second
…includes state of the art data/text/web/stream-mining algorithms
Deployed in British Telecom, NYTimes, Bloomberg, Microsoft, TheStreet.com, … ongoing work with Google News, Telefonica, Wikipedia
Transform & Enrich
Anomaly detection
Complex events detection
Analytics: Prediction, Segment, Visualization
Model
Capture Reality (Events)
Sensors, Alarms, User logs, …
![Page 76: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/76.jpg)
Network Monitoring for British Telecom
British Telecom Network (~25 000 devices)
Alarms ~10-100/sec
Alarms Server
Live feed of data
Alarms Explorer Server
Operator Big board display
Alarms Explorer Server implements three real-time scenarios on the alarms stream:
1. Root-Cause-Analysis – finding which device is responsible for an occasional “flood” of alarms
2. Short-Term Fault Prediction – predicting which device will fail in the next 15 minutes
3. Long-Term Anomaly Detection – detecting unusual trends in the network
…system is used in British Telecom
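A minimal sketch of the first scenario under strong simplifying assumptions: a sliding time window over the alarm stream, where a "flood" is declared once the window holds enough alarms and the device with the most alarms in the window is reported as the suspect. This naive heuristic only stands in for the real root-cause analysis, which the slides do not describe; the class name, parameters and toy stream are hypothetical.

```python
from collections import Counter, deque

class FloodDetector:
    def __init__(self, window_sec=60.0, flood_threshold=100):
        self.window_sec = window_sec
        self.flood_threshold = flood_threshold
        self.alarms = deque()        # (timestamp, device_id), oldest first
        self.per_device = Counter()  # alarms per device inside the window

    def add(self, timestamp, device_id):
        """Feed one alarm; return the suspect device during a flood, else None."""
        self.alarms.append((timestamp, device_id))
        self.per_device[device_id] += 1
        # Drop alarms that have fallen out of the time window.
        while self.alarms and self.alarms[0][0] <= timestamp - self.window_sec:
            _, old_device = self.alarms.popleft()
            self.per_device[old_device] -= 1
            if self.per_device[old_device] == 0:
                del self.per_device[old_device]
        if len(self.alarms) >= self.flood_threshold:
            return self.per_device.most_common(1)[0][0]
        return None

detector = FloodDetector(window_sec=10, flood_threshold=5)
stream = [(0, "a"), (1, "b"), (2, "b"), (3, "c"), (4, "b")]  # toy alarms
suspects = [detector.add(t, device) for t, device in stream]
```

With these toy settings the fifth alarm fills the window and device "b", which raised three of the five alarms, is returned as the suspect.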
![Page 77: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/77.jpg)
Visualizing Root-cause and prediction
Root-cause
Prediction
![Page 78: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/78.jpg)
How Well Are We Predicting
[Chart: Percentage Realisation of Predictions; x-axis: Minutes; y-axis: Percentage (0% to 90%); data labels: 86%, 80%, 60%]
![Page 79: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/79.jpg)
User Modeling for NYTimes & Bloomberg
Log Files (~100M page clicks per day)
User profiles
NYT articles
Stream of profiles
Advertisers
Segments
Trend Detection System
Stream of clicks
Trends and updated segments
Campaign to sell segments
$ Sales

| Segment | Keywords |
| --- | --- |
| Stock Market | Stock Market, mortgage, banking, investors, Wall Street, turmoil, New York Stock Exchange |
| Health | diabetes, heart disease, disease, heart, illness |
| Green Energy | Hybrid cars, energy, power, model, carbonated, fuel, bulbs |
| Hybrid cars | Hybrid cars, vehicles, model, engines, diesel |
| Travel | travel, wine, opening, tickets, hotel, sites, cars, search, restaurant |
| … | … |
![Page 80: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/80.jpg)
Generalizing from registered users
[Chart: BEP for Age (20% = random)]
[Chart: BEP for Gender on users with at least 10 visits (50% = random)]
Feature sets on the x-axis: Context, Text Features, Named Entities, All Meta Data, All Content, All Features
Legend entries: Male, Female; ≥2, ≥10, ≥50
![Page 81: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/81.jpg)
Using User Modeling for News Recommendations
Good recommendations can make a big difference when keeping a user on a web site
…the key is how rich a context model a system is using to select information for a user
Bad recommendations get clicks from <1% of users, good ones from >5% of users
Contextual personalized recommendations generated in ~20ms
![Page 82: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/82.jpg)
Recommendation Features:
History (user profile)
Geo (based on IP)
Requested page (where we serve recommendation)
Referring URL
Time
time now
training
US
Finance
Oil

Regular (visits > 50)

| Metric | All | History | Context | Geo | Requested | Referring | Time |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Top1 Recall | 66 | 65 | 65 | 65 | 66 | 60 | 60 |
| Top2 Recall | 81 | 78 | 78 | 75 | 78 | 67 | 67 |
| Top3 Recall | 86 | 83 | 83 | 79 | 81 | 72 | 72 |
| Top Precision | 52 | 48 | 49 | 43 | 41 | 36 | 36 |

New (first visit)

| Metric | Context | Geo | Requested | Referring | Time |
| --- | --- | --- | --- | --- | --- |
| Top1 Recall | 60 | 58 | 46 | 60 | 60 |
| Top2 Recall | 77 | 70 | 61 | 71 | 71 |
| Top3 Recall | 85 | 77 | 72 | 78 | 78 |
| Top Precision | 45 | 36 | 35 | 37 | 37 |
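A minimal sketch of how a "Top-k Recall" figure like the ones on this slide could be computed. The slides do not define the metric, so the definition used here (the share of test cases in which the item the user actually chose appears among the top k recommendations) is an assumption, as are the function name and the toy data.

```python
def top_k_recall(recommendations, chosen, k):
    """Share of cases where the chosen item is among the top k recommendations.

    recommendations: list of ranked lists, best first
    chosen:          list of the item the user actually picked in each case
    """
    hits = sum(1 for recs, item in zip(recommendations, chosen) if item in recs[:k])
    return hits / len(chosen)

# Toy data: ranked category recommendations for four page views.
recommendations = [
    ["Finance", "US", "Oil"],
    ["Oil", "Finance", "US"],
    ["US", "Oil", "Finance"],
    ["Finance", "Oil", "US"],
]
chosen = ["Finance", "Finance", "Finance", "US"]

recall_at_1 = top_k_recall(recommendations, chosen, k=1)
recall_at_2 = top_k_recall(recommendations, chosen, k=2)
recall_at_3 = top_k_recall(recommendations, chosen, k=3)
```

Under this definition recall can only grow with k, which matches the Top1 to Top3 rows rising in every column.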
![Page 83: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/83.jpg)
Real-time Architecture
Logging
Collaborative Filter
SVM
Archive
Web
Amazon
Crawl
![Page 84: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/84.jpg)
Results
[Chart: News Personalization Test, Page-Story Page Transition Probabilities; y-axis: 0.0% to 7.0%; x-axis: 17 Apr, 24 Apr, 1 May, 8 May, 15 May]
Legend: Control JSI SVM Random JSI CF DailyMe Personalized Most Popular Contextual Competitor
![Page 85: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/85.jpg)
http://log3.quintelligence.com/test/rec/test-svmcfni.html
![Page 86: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/86.jpg)
PERSONALIZED NEWS SEARCH
Lorand Dali, Blaž Fortuna
![Page 87: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/87.jpg)
Personalized News Search – System Architecture
Ranking Model
Learning to Rank
Query
Search Logs
keywords
User
− age
− country
− gender
− income
− industry
− job
![Page 88: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/88.jpg)
User: Young female computer programmer
Query: Religion
![Page 89: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/89.jpg)
User: Middle aged male clergy
Query: Religion
![Page 90: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/90.jpg)
Videolectures.net
562 events, 8169 authors, 10539 lectures, 12859 videos
![Page 91: Machine Learning and Knowledge Discovery for Semantic Webtranslectures.videolectures.net/site/normal_dl/tag=... · 2011-05-27 · Semantic Web integrates many existing ideas and technologies](https://reader034.fdocuments.in/reader034/viewer/2022042109/5e89044abd204b2acc57bc22/html5/thumbnails/91.jpg)
Montreal @ Video Lectures