Marko Grobelnik, Janez Brank, Blaž Fortuna, Igor Mozetič.




Outline
- Ontology
- Ontolight
- Definition
- Grounding
- Population
- Applications
- Integration in OntoGen
- Demo

What is an ontology?
An ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts.
Generally it consists of:
- Classes: sets, collections, or types of objects
- Instances: the basic or "ground level" objects
- Relations: ways that objects can be related to one another
It can be used:
- as a schema for a knowledge management system
- to reason about the objects within that domain
- etc.

Sample Ontology

Examples of Real-world Ontologies
- AgroVoc
  - Multilingual thesaurus for the fields of agriculture, forestry, fisheries, food security and related domains
  - Consists of terms in different languages and thesaurus relationships between terms (broader, narrower, related)
- ASFA
  - Thesaurus used for annotating bibliography related to aquatic science literature
- EuroVoc
  - Multilingual thesaurus used by European institutions
  - The Acquis Communautaire corpus is annotated with EuroVoc
- Cyc
  - Knowledge base, a formalization of fundamental human knowledge
- Dmoz
  - The Open Directory Project
  - World's largest directory of the WWW, maintained by volunteer editors

What is Ontolight?
- Simple model covering most of the well-known light-weight ontologies
- Stores an ontology as a rich graph
- Defined as:
  - List of languages used for lexical terms (covers multilinguality)
  - List of class types (types of nodes in the graph)
  - List of classes (nodes in the graph)
  - List of relation types (types of links in the graph)
  - List of relations (links in the graph)
- Grounding model
  - A function which proposes a set of classes for a given instance
  - Corresponds to classification in machine learning

Grounding
- Multiclass classification model trained on the instances of the ontology
  - In the case of Dmoz: web pages
  - In the case of EuroVoc: EU legislation
- We used a centroid-based classifier
  - Calculates a centroid vector for each class
  - Uses knowledge of the hierarchy
  - Classification is performed by the kNN algorithm
- Highly scalable: can handle hundreds of thousands of classes

Population
- Takes an instance as input
- Output is a list of suggested classes
- Example from EuroVoc
  - Instance: "Slovenia and Croatia are having a fishing industry"
  - Output:

OntoGen
- Ontology construction and learning
- Semi-automatic:
  - Text-mining methods provide suggestions and insights into the domain
  - The user can interact with the parameters of the text-mining methods
  - All final decisions are taken by the user
- Data-driven:
  - Most of the aid provided by the system is based on some underlying data provided by the user
  - Instances are described by features extracted from the data (e.g. bag-of-words vectors)

[Screenshots of the OntoGen interface; labels: concept hierarchy, list of suggested sub-concepts, ontology visualization, selected concept, concept details, concept instance management, keywords, selected instance]

Contextualized ontology generation
- Ontolight is integrated with OntoGen
- Helps in generating a new ontology by means of existing ontologies
- The user loads Ontolight into OntoGen at start
- Suggestion methods:
  - Concept suggestion: offers concepts from the loaded Ontolight as possible sub-concepts
  - Name suggestion: offers names of concepts from Ontolight as possible concept names
- All suggestions are integrated in a semi-automatic manner

Concept suggestion
- The user selects a concept
- The user selects an Ontolight ontology
- OntoGen classifies each document into the context Ontolight ontology
- Concepts with the most documents are provided as suggestions to the user

Name suggestion
- The user selects a concept
- OntoGen classifies each document into the loaded context Ontolight ontologies
- Names of the concepts with the most classified documents are provided as suggestions to the user

AgroVoc and EuroVoc applied to Yahoo Finance data

Demo
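
The grounding and concept suggestion steps described in this transcript can be illustrated with a minimal Python sketch. It is not the actual OntoGen or Ontolight implementation: the function names, the toy training data and the class labels are hypothetical, plain whitespace tokenization stands in for real text preprocessing, and the hierarchy knowledge mentioned in the transcript is left out. Each class gets a centroid (the mean bag-of-words vector of its instances), a new instance is assigned to the classes with the most similar centroids (cosine similarity is assumed here), and concept suggestion counts how many documents land in each class.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lower-cased term-frequency vector for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * b[term] for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled_texts):
    """Build one centroid per class by averaging its instance vectors."""
    sums, counts = defaultdict(Counter), Counter()
    for text, cls in labelled_texts:
        sums[cls].update(bag_of_words(text))
        counts[cls] += 1
    return {
        cls: Counter({term: w / counts[cls] for term, w in vec.items()})
        for cls, vec in sums.items()
    }


def ground(text, centroids, k=1):
    """Return the k classes whose centroids are most similar to the text."""
    vec = bag_of_words(text)
    ranked = sorted(
        centroids, key=lambda cls: cosine(vec, centroids[cls]), reverse=True
    )
    return ranked[:k]


def suggest_concepts(documents, centroids, top_n=3):
    """Ground every document and return the classes with the most documents."""
    votes = Counter(ground(doc, centroids)[0] for doc in documents)
    return [cls for cls, _ in votes.most_common(top_n)]


# Toy training data; the texts and class names are made up for illustration.
TRAINING = [
    ("fishing fleet catch quota", "fisheries"),
    ("fish stock fishing industry", "fisheries"),
    ("wheat crop harvest yield", "agriculture"),
    ("crop farm soil harvest", "agriculture"),
]
CENTROIDS = train_centroids(TRAINING)
```

With this toy data, `ground("Slovenia and Croatia are having a fishing industry", CENTROIDS)` returns `['fisheries']`, because only the fisheries centroid shares terms with the instance. Keeping a single centroid per class means one similarity computation per class at query time, which is one plausible reason such a classifier scales to very large class sets.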