Robert Stevens Ebi
8/8/2019 Robert Stevens Ebi
http://img.cs.man.ac.uk/stevens
Building and Using Ontologies
Robert Stevens
Department of Computer Science
University of Manchester
Manchester UK
-
Introduction
The nature of bioinformatics resources
What is knowledge?
What is an ontology?
What are the uses of ontologies?
Components of an ontology
Building an ontology (in brief)
-
The Nature of Bioinformatics Resources
Over 500 databanks and analysis tools that work over resources
Repositories of knowledge and data and generation of new knowledge
Knowledge often held as free text; some use made of controlled vocabularies
Enormous amount of semantic heterogeneity and poor query facilities
Knowledge about services not always apparent
-
What is Knowledge?
Knowledge -- all information and an understanding to carry out tasks and to infer new information
Information -- data equipped with meaning
Data -- un-interpreted signals that reach our senses
Data: PATRICIAGRACEKENNEDYSAIDMINEISAPINT
Information (read as English): Patricia Grace Kennedy said mine is a pint (name, noun, verb)
Knowledge: Pat Baker is a Manchester bioinformatician who drinks beer.
Data: CEKENN
Information (read as single letter amino acid codes): C -- cysteine, K -- lysine
Knowledge: Protein that acts as a tyrosine kinase in the liver of primates.
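The step from data to information on this slide can be sketched in a few lines of Python. This is only an illustration: the lookup table is a deliberate subset of the standard one-letter amino acid codes (just the letters in CEKENN), and the function name `interpret_as_residues` is hypothetical.

```python
from typing import List

# Standard one-letter amino acid codes; only the letters needed for the
# slide's CEKENN example are included (a deliberate subset).
AMINO_ACID_CODES = {
    "C": "cysteine",
    "E": "glutamic acid",
    "K": "lysine",
    "N": "asparagine",
}


def interpret_as_residues(data: str) -> List[str]:
    """Turn raw letters (data) into residue names (information)."""
    return [AMINO_ACID_CODES[letter] for letter in data]


raw = "PATRICIAGRACEKENNEDYSAIDMINEISAPINT"
start = raw.index("CEKENN")          # locate the fragment inside the raw string
fragment = raw[start:start + 6]
print(fragment, "->", interpret_as_residues(fragment))
```

The same letters read as English give a sentence about a pint; read as residue codes they give a peptide fragment. Which reading applies is not in the data itself, which is the gap an ontology is meant to fill.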
-
Capturing Knowledge
Capturing knowledge for both humans and computer applications
A set of vocabulary definitions that capture a community's knowledge of a domain
`An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms, and some specification of their meaning. This includes definitions and an indication of how concepts are inter-related which collectively impose a structure on the domain and constrain the possible interpretations of terms.'
-
What Does an Ontology Do?
Captures knowledge
Creates a shared understanding between humans and for computers
Makes knowledge machine processable
Makes meaning explicit by definition and context
-
What is an Ontology?
[Figure: a spectrum of artefacts, from least to most formal]
Catalog/ID
Terms/glossary
Thesauri, `narrower term' relation
Informal is-a
Formal is-a
Formal instance
Frames (properties)
Value restrictions
Disjointness, inverse, part of
General logical constraints
-
Roles of Ontologies in Bioinformatics
We can divide ontology use into three types:
Domain-oriented, which are either domain specific (e.g. E. coli) or domain generalisations (e.g. gene function or ribosomes);
Task-oriented, which are either task specific (e.g. annotation analysis) or task generalisations (e.g. problem solving);
Generic, which capture common high level concepts, such as Physical, Abstract and Substance. Important in ontology management and language applications.
-
Uses of Ontology
Community reference -- neutral authoring.
Either defining database schema or defining a common vocabulary for database annotation -- ontology as specification.
Providing common access to information. Ontology-based search by forming queries over databases.
Understanding database annotation and technical literature.
Guiding and interpreting analyses and hypothesis generation.
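One way to picture ontology-based search is query expansion over is-a links: a query for a general term also returns records annotated with more specific terms. The sketch below assumes a toy taxonomy and a toy set of annotated records; every term, record id and function name in it is hypothetical and not taken from any real database.

```python
# Hypothetical toy taxonomy: child term -> parent term (is-a links).
IS_A = {
    "tyrosine kinase": "kinase",
    "serine kinase": "kinase",
    "kinase": "enzyme",
    "enzyme": "protein",
}

# Hypothetical annotated records: record id -> annotation term.
ANNOTATIONS = {
    "rec1": "tyrosine kinase",
    "rec2": "serine kinase",
    "rec3": "enzyme",
    "rec4": "protein",
}


def is_kind_of(term, ancestor):
    """Follow is-a links upward from term until ancestor or the root is reached."""
    while term is not None:
        if term == ancestor:
            return True
        term = IS_A.get(term)
    return False


def search(query_term):
    """Return ids of records annotated with query_term or any kind of it."""
    return sorted(
        rec for rec, term in ANNOTATIONS.items() if is_kind_of(term, query_term)
    )


print(search("kinase"))  # ['rec1', 'rec2']
```

A plain keyword match on "kinase" would happen to work here because the word appears in both annotations; a query for "enzyme" shows the difference, since it also returns the two kinase records that never mention the word.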
-
Components of an Ontology
Concepts: class of individuals -- the concept Protein and the individual `human cytochrome C'
Relationships between concepts
`Is a kind of' relationship forms a taxonomy
Other relationships give further structure -- `is a part of'
Axioms -- disjointness, covering, equivalence, ...
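The components listed above map naturally onto a small data structure. The following is a minimal sketch, assuming Python dataclasses; the `Ontology` class, its field names, the part-of pair and the deliberately faulty `Chimera` concept are all hypothetical. Only a disjointness axiom is checked; covering and equivalence are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    is_a: dict = field(default_factory=dict)         # concept -> set of direct parents
    part_of: set = field(default_factory=set)        # (part, whole) pairs
    instance_of: dict = field(default_factory=dict)  # individual -> concept
    disjoint: set = field(default_factory=set)       # frozenset({a, b}) axioms

    def ancestors(self, concept):
        """All concepts reachable through is-a links, including concept itself."""
        seen, stack = set(), [concept]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.is_a.get(current, ()))
        return seen

    def unsatisfiable(self):
        """Concepts that fall under two concepts declared disjoint."""
        return sorted(
            c for c in self.is_a
            if any(pair <= self.ancestors(c) for pair in self.disjoint)
        )


onto = Ontology(
    is_a={
        "Macromolecule": {"Molecule"},
        "Protein": {"Macromolecule"},
        "Nucleic acid": {"Macromolecule"},
    },
    instance_of={"human cytochrome C": "Protein"},
    part_of={("Peptide", "Protein")},                # hypothetical part-of link
    disjoint={frozenset({"Protein", "Nucleic acid"})},
)
onto.is_a["Chimera"] = {"Protein", "Nucleic acid"}   # deliberately faulty concept
print(onto.unsatisfiable())  # ['Chimera']
```

The disjointness axiom is what lets the structure catch the modelling error: without it, `Chimera` would simply be a concept with two parents.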
-
Knowledge Representation
Ontologies are best delivered in some computable representation
Variety of choices with different:
Expressiveness
The range of constructs that can be used to formally, flexibly, explicitly and accurately describe the ontology
Ease of use
Computational complexity
Is the language computable in real time?
Rigour -- satisfiability and consistency of the representation
Systematic enforcement mechanisms
Unambiguous, clear and well defined semantics
-
Languages
Vocabularies using natural language
Hand crafted, flexible but difficult to evolve, maintain and keep consistent, with weak semantics
Gene Ontology
Object-based KR: frames
Extensively used, good structuring, intuitive. Semantics defined by OKBC standard
EcoCyc (uses Ocelot) and RiboWeb (uses Ontolingua)
Logic-based: Description Logics
Very expressive, model is a set of theories, well defined semantics
Automatically derived classification taxonomies
Concepts are either defined or primitive
-
Building Ontologies
No field of Ontological Engineering equivalent to Knowledge or Software Engineering;
No standard methodologies for building ontologies;
Such a methodology would include:
a set of stages that occur when building ontologies;
guidelines and principles to assist in the different stages;
an ontology life-cycle which indicates the relationships among stages.
-
The Development Lifecycle
Two kinds of complementary methodologies emerged:
Stage-based, e.g. TOVE [Uschold96]
Iterative evolving prototypes, e.g. MethOntology [Gomez Perez94].
Most have TWO stages:
1. Informal stage
ontology is sketched out using either natural language descriptions or some diagram technique
2. Formal stage
ontology is encoded in a formal knowledge representation language that is machine computable
the informal representation helps the former
the formal representation helps the latter.
-
A Provisional Methodology
A skeletal methodology and life-cycle for building ontologies;
Inspired by the software engineering V-process model;
The overall process moves through a life-cycle.
The left side charts the processes in building an ontology
The right side charts the guidelines, principles and evaluation used to quality assure the ontology
-
The V-model Methodology
Identify purpose and scope
Knowledge acquisition
Conceptualisation
Integrating existing ontologies
Encoding
Representation
Evaluation: coverage, verification, granularity
Conceptualisation principles: commitment, conciseness, clarity, extensibility, coherency
Encoding/Representation principles: encoding bias, consistency, house styles and standards, reasoning system exploitation
Ontology in Use
User Model
Conceptualisation Model
Implementation Model
-
The ontology building life-cycle
Identify purpose and scope
Knowledge acquisition
Evaluation
Language and representation
Available development tools
Conceptualisation
Integrating existing ontologies
Encoding
Building
-
Starting Concept List
Chemicals -- atom, ion, molecule, compound, element;
Molecular-compound, ionic-compound, ionic-molecular-compound, ...;
Ionic-macromolecular-compound and ionic-small-macromolecular-compound;
Protein, peptide, polyprotein, enzyme, holoprotein, apoprotein, ...
Nucleic acid -- DNA, RNA, tRNA, mRNA, snRNA, ...
-
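Conceptualisation Sketch
[Figure: taxonomy sketch; the links between nodes are not recoverable from the transcript]
Chemical
Atom, Element, Compound, Molecule, Ion
Metal, Non-Metal, Metalloid
Molecular Compound
Molecular Element
Ionic Compound
Ionic Molecule
Ionic Molecular Compound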
-
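Molecule Conceptualisation Sketch
[Figure: taxonomy sketch; the links between nodes are not recoverable from the transcript]
Macromolecule, Small Molecule
Nucleic Acid, Protein, Polysaccharide
DNA, RNA, Enzyme
mRNA, tRNA, rRNA, snRNA
Starch, Glycogen
Peptide
Ionic Macromolecular Compound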
-
Initial Encoding
class-def chemical
  subclass-of substance
class-def molecule
  subclass-of chemical
class-def compound
  subclass-of chemical
class-def molecular-compound
  subclass-of molecule and compound
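To show what a machine can do with even this first encoding, the sketch below reads the four class definitions and derives every superclass of molecular-compound. It is a reader for these few lines only, not a parser for the full representation language; the function names `parse_class_defs` and `superclasses` are hypothetical.

```python
ENCODING = """\
class-def chemical
  subclass-of substance
class-def molecule
  subclass-of chemical
class-def compound
  subclass-of chemical
class-def molecular-compound
  subclass-of molecule and compound
"""


def parse_class_defs(text):
    """Map each class name to the list of its declared direct superclasses."""
    parents, current = {}, None
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "class-def":
            current = words[-1]          # last word is the class name
            parents[current] = []
        elif words[0] == "subclass-of" and current is not None:
            parents[current] += [w for w in words[1:] if w != "and"]
    return parents


def superclasses(parents, name):
    """Transitive closure of subclass-of, excluding name itself."""
    found = set()
    stack = list(parents.get(name, []))
    while stack:
        cls = stack.pop()
        if cls not in found:
            found.add(cls)
            stack.extend(parents.get(cls, []))
    return found


taxonomy = parse_class_defs(ENCODING)
print(sorted(superclasses(taxonomy, "molecular-compound")))
# ['chemical', 'compound', 'molecule', 'substance']
```

Nothing in the encoding states that a molecular-compound is a substance; that follows from chaining the subclass-of links, which is the simplest kind of inference a taxonomy supports.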
-
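Molecules Revisited
[Figure: taxonomy sketch; the links between nodes are not recoverable from the transcript]
Macromolecule, Small Molecule
Nucleic Acid, Protein, Polysaccharide
DNA, RNA, Enzyme
mRNA, tRNA, rRNA, snRNA
Starch, Glycogen
Peptide
Ionic Macromolecular Compound
Non-Ionic Macromolecular Compound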
-
More Encoding
class-def chemical
  subclass-of substance
class-def defined molecule
  subclass-of chemical
  slot-constraint contains-bond min-cardinality 1 has-value covalent-bond
class-def defined compound
  subclass-of chemical
  slot-constraint has-atom-types greater-than 1
class-def defined molecular-compound
  subclass-of molecule and compound
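A defined class states conditions that are sufficient for membership, so anything meeting them can be classified automatically. The sketch below imitates that for individual chemicals with plain Python predicates. It is not a description logic reasoner, and it assumes a particular reading of the slot constraints: at least one covalent bond for molecule, more than one atom type for compound. The `Chemical` class and the three example chemicals are hypothetical additions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    name: str
    bond_types: frozenset   # e.g. frozenset({"covalent-bond"})
    atom_types: frozenset   # element symbols present


def is_molecule(c):
    # contains-bond min-cardinality 1 has-value covalent-bond (assumed reading)
    return "covalent-bond" in c.bond_types


def is_compound(c):
    # has-atom-types greater-than 1
    return len(c.atom_types) > 1


DEFINITIONS = {
    "molecule": is_molecule,
    "compound": is_compound,
    "molecular-compound": lambda c: is_molecule(c) and is_compound(c),
}


def classify(c):
    """Names of all defined classes whose conditions the chemical meets."""
    return sorted(name for name, test in DEFINITIONS.items() if test(c))


water = Chemical("water", frozenset({"covalent-bond"}), frozenset({"H", "O"}))
oxygen = Chemical("oxygen gas", frozenset({"covalent-bond"}), frozenset({"O"}))
salt = Chemical("sodium chloride", frozenset({"ionic-bond"}), frozenset({"Na", "Cl"}))

for chem in (water, oxygen, salt):
    print(chem.name, "->", classify(chem))
```

Water is never asserted to be a molecular-compound; it is recognised as one because it satisfies both definitions. This is the behaviour the `defined` keyword is meant to enable, here reduced to membership tests on individuals.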
-
Expansion
Sketch and encode in cycles
Build a taxonomy of a small portion
Then build links to other portions
Add more detail
Document sources, author, date and argumentation.
-
Summary
An ontology captures knowledge for a shared understanding
The important question is not whether an artefact is an ontology, but whether it does any good
Making our understanding of a domain explicit, consistent and processable
Bioinformatics resources are knowledge resources; that knowledge needs to be both human and machine understandable