Page 1: A Real-World Knowledge Engineering Application: The NeuroScholar Project Gully APC Burns K. M. Research Group University of Southern California.

A Real-World Knowledge Engineering Application: The NeuroScholar Project

Gully APC Burns

K. M. Research Group
University of Southern California

Page 2

Structure of the presentation

1. Ideas & Concepts
2. Design
3. Implementation
4. Demonstration

Page 3

I. Ideas & Concepts

In which we are reminded of what most people think knowledge is, how it is currently used (and misused) and how we might improve matters.

Page 4

What does the word ‘Knowledge’ mean?

Main Entry: knowl·edge
Pronunciation: 'nä-lij
Function: noun
Etymology: Middle English knowlege, from knowlechen to acknowledge, irregular from knowen
Date: 14th century
1 obsolete : COGNIZANCE
2 a (1) : the fact or condition of knowing something with familiarity gained through experience or association (2) : acquaintance with or understanding of a science, art, or technique b (1) : the fact or condition of being aware of something (2) : the range of one's information or understanding <answered to the best of my knowledge> c : the circumstance or condition of apprehending truth or fact through reasoning : COGNITION d : the fact or condition of having information or of being learned <a man of unusual knowledge>
3 archaic : SEXUAL INTERCOURSE
4 a : the sum of what is known : the body of truth, information, and principles acquired by mankind b archaic : a branch of learning

[from http://www.m-w.com/]

Page 5

The published literature

Image taken from U.S. Geological Survey Energy Resource Surveys Program

… is the end-product of research and as such forms the basis for human understanding of the subject

… is very valuable.

… is structured.

… is interpretable.

Page 6

The published literature

… is large and unwieldy.
… has varying reliability.
… is inconsistent.
… is based on natural language.
… is difficult to automate.
… is terse
… is qualitative
… is 2-D

Page 7

The published literature

… is a valid target for attack with informatics-based methods. This permits …
(a) increased clarification through formalization
(b) large-scale data-handling capability
(c) analysis of existing data to examine organization

Page 8

A semantic continuum

[Mike Uschold, Boeing Corp]

From left to right along the continuum:
Implicit: shared human consensus
Informal (explicit): text descriptions
Formal (for humans): semantics hardwired; used at runtime
Formal (for machines): semantics processed and used at runtime

Further to the right means:
• Less ambiguity
• More likely to have correct functionality
• Better inter-operation (hopefully)
• Less hardwiring
• More robust to change
• More difficult

Positions marked on the continuum:
The current status of ‘theory’ in Neuroscience
How we would like neuroscientists to think
Where we would like to work

Page 9

What’s wrong with this picture? … from a neuroscientist’s point of view …

From Swanson (1998), “Brain Maps, Structure of the Rat Brain”, 2nd edition, Elsevier, Amsterdam.

Number of structures = 500 x 2

Number of Cell Groups per structure = 10

Number of Possible Connections between cell groups = 10,000 x 10,000 = 10^8

Estimated Number of Connections between cell groups = 250,000
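
The arithmetic behind these figures can be checked in a few lines. This is only a sketch that multiplies the counts quoted on the slide; the variable names are illustrative, and the final ratio is derived here rather than stated in the source.

```python
# Counts as written on the slide.
structures = 500 * 2
cell_groups_per_structure = 10
cell_groups = structures * cell_groups_per_structure      # 10,000 cell groups

# Every cell group could in principle connect to every cell group.
possible_connections = cell_groups * cell_groups          # 10,000 x 10,000 = 10^8

estimated_connections = 250_000

# Derived for illustration only: share of the possible connection matrix in use.
fraction_occupied = estimated_connections / possible_connections

print(cell_groups)            # 10000
print(possible_connections)   # 100000000
print(fraction_occupied)      # 0.0025
```

On these numbers only about a quarter of one percent of the possible connections are estimated to exist, which is still far too many to track by hand.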

Page 10

… it’s even worse than that …

Neuroscience is extremely multidisciplinary
Spatial Scales of Measurement: 10^1 – 10^-9 m
Temporal Scales of Measurement: 70 yrs (2.21 x 10^9 s) to 10^-3 s (not even including evolutionary time!)

Study occurs in a heterogeneous theoretical framework involving:

Anatomy, Physiology, Psychology, Ethology, Biochemistry (Molecular Biology, Genetics, Bioinformatics), Biophysics, Behavioural Ecology, Biology … to name a few…

All of these subjects are specialized, and it is hard to link work between disciplines and across levels

Page 11

… & it’s even worse than that !!!

Neuroanatomical nomenclature is the closest thing that neuroscience has to a standardized framework…

In any given paper, the same name may be used for different structures, or different names may be used for the same structure.

e.g., ‘Globus Pallidus, pars medialis (GPm)’ is also called the ‘Entopeduncular Nucleus’ by others.

See the index of Swanson (1998), “Brain Maps, Structure of the Rat Brain”, 2nd edition, Elsevier, Amsterdam, for a list of synonyms according to one source.
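
One simple way to handle the "different names, same structure" case is a synonym table that maps every name to a single preferred term. The sketch below is an assumption for illustration, not NeuroScholar's method: the table, the function name and the choice of preferred term are all hypothetical, and only the GPm / Entopeduncular Nucleus pair comes from the slide.

```python
# Hypothetical synonym table: each known name maps to one preferred term.
# Which name counts as "preferred" is an arbitrary choice made for this sketch.
SYNONYMS = {
    "globus pallidus, pars medialis": "Entopeduncular Nucleus",
    "gpm": "Entopeduncular Nucleus",
    "entopeduncular nucleus": "Entopeduncular Nucleus",
}


def normalize_structure_name(name: str) -> str:
    """Return the preferred term for a name, or the trimmed name if it is unknown."""
    key = " ".join(name.lower().split())   # case-fold and collapse whitespace
    return SYNONYMS.get(key, name.strip())


print(normalize_structure_name("Globus Pallidus, pars  medialis"))  # Entopeduncular Nucleus
print(normalize_structure_name("GPm"))                              # Entopeduncular Nucleus
print(normalize_structure_name("Zona Incerta"))                     # Zona Incerta
```

A flat table like this cannot cope with the opposite problem, where one name means different structures in different papers; that would need the lookup to be keyed by source as well as by name.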

Page 12

We restrict the problem space to a specific soluble strategy

1. Describe a given phenomenon (e.g., the stress response).

2. Identify which populations of neurons are involved in the phenomenon (i.e., any neurons that turn on, turn off, change their firing, affect the phenomenon if messed with, etc.).

3. Represent how these populations of neurons are interconnected.

4. Represent the dynamic processes of these neurons that underlie the phenomenon.

Page 13

A Construct: ‘A Knowledge Model’

= A personalized representation of an individual’s knowledge.

e.g., A review article is an example of a non-computational knowledge model

Page 14

Another Construct: ‘Knowledge Landscape’

= A map of Knowledge Models (where each KM is timestamped)

e.g., A list of the best reviews of a given subject over time is an example of a non-computational knowledge landscape
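
The two constructs can be sketched as plain data structures. This is a minimal illustration under stated assumptions: the class and field names are hypothetical, a "map" is reduced to a time-ordered collection, and the sample authors, dates and subject entries are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class KnowledgeModel:
    """One person's representation of a subject at one point in time."""
    author: str
    subject: str
    timestamp: date                      # each KM in a landscape is timestamped
    statements: List[str] = field(default_factory=list)


@dataclass
class KnowledgeLandscape:
    """A map of knowledge models, reduced here to a collection ordered by time."""
    subject: str
    models: List[KnowledgeModel] = field(default_factory=list)

    def add(self, model: KnowledgeModel) -> None:
        self.models.append(model)

    def timeline(self) -> List[KnowledgeModel]:
        """All models, oldest first."""
        return sorted(self.models, key=lambda m: m.timestamp)

    def latest(self, author: str) -> Optional[KnowledgeModel]:
        """Most recent model by the given author, or None if there is none."""
        own = [m for m in self.timeline() if m.author == author]
        return own[-1] if own else None


# Placeholder data only: these authors and dates are invented for the example.
landscape = KnowledgeLandscape(subject="the stress response")
landscape.add(KnowledgeModel("Reviewer B", "the stress response", date(2001, 5, 1)))
landscape.add(KnowledgeModel("Reviewer A", "the stress response", date(1998, 3, 1)))
landscape.add(KnowledgeModel("Reviewer A", "the stress response", date(2002, 9, 1)))

print([m.author for m in landscape.timeline()])   # ['Reviewer A', 'Reviewer B', 'Reviewer A']
print(landscape.latest("Reviewer A").timestamp)   # 2002-09-01
```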

Page 15

II. Design

In which all of these high-falutin’ ideas are put into a logical design and it becomes clear that the design criteria of the NeuroScholar project distinguish it from pure research in computer science

Page 16

Some design requirements

In order of importance

1. Powerful & enabling to neuroscientists in their everyday work

2. Easy to use! (i.e., free, multi-platform, one-click installation)

3. Knowledge acquisition / data collation is the rate limiting step

4. Open-source for future development as an academic project.

Page 17

Knowledge Landscapes

NeuroScholar Screenshot- (dummy data)

Page 18

Knowledge Landscapes

‘Knowledge Landscape’

‘Knowledge Model’

‘Fragments’

‘Entities’

‘Properties’ ‘Relations’

‘Annotations’

‘Data Collection’

NeuroScholar Screenshot- (dummy data)

Page 19

Knowledge Models & examples

‘Data Collection’: A set of data fragments, e.g. a publication: Allen GV & DF Cechetto. (1993) J Comp Neurol 330:421-438.

Page 20

Knowledge Models & examples

‘Fragments’: individual pieces of the literature, e.g. descriptions of experimental results.

“… Moderate to light terminal labeling was present in the parvocellular portions of the paraventricular nucleus, anterior-hypothalamic nucleus, anterior portion of the lateral hypothalamic area (Figs. 2D, 3B), and in the central nucleus of the amygdala (Fig. 2D)….”

From Allen & Cechetto (1993)

Page 21

Knowledge Models & examples

‘Entities’ and ‘Properties’: Abstract data structures that capture the meaning of a set of fragments within the framework of the NeuroScholar system

e.g. neuronPopulation object
knowledge type = description
domain type = tract-tracing experiment

Diagram labels: injectionSite, labeling, experimentalMethod, brainVolumes

Page 22

Knowledge Models & examples

‘Relations’: Rules that link two objects together.

Diagram labels: LHA, ZI

Page 23

Knowledge Models & examples

‘Summaries’: Sets of objects and relations, explicitly selected and prioritized within system

Diagram labels: neuronPopulation1, neuronPopulation2

Page 24

Knowledge Models & examples

‘Annotations’: Human-interpretable text to make contents of knowledge base understandable
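
The components introduced on the preceding slides (fragments, entities with properties, relations, annotations) can be sketched as a handful of small classes. This is an illustrative sketch only: the class layout and field names are assumptions rather than NeuroScholar's actual schema, and every value marked as a placeholder below is invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Fragment:
    """An individual piece of the literature, e.g. a quoted sentence."""
    text: str
    source: str                      # the data collection (publication) it came from


@dataclass
class Entity:
    """Abstract structure capturing the meaning of a set of fragments."""
    name: str
    knowledge_type: str = ""
    domain_type: str = ""
    properties: Dict[str, str] = field(default_factory=dict)
    fragments: List[Fragment] = field(default_factory=list)
    annotations: List[str] = field(default_factory=list)   # human-readable notes


@dataclass
class Relation:
    """A rule linking two objects; `kind` is a hypothetical label."""
    kind: str
    first: Entity
    second: Entity


fragment = Fragment(
    text="Moderate to light terminal labeling was present in the parvocellular "
         "portions of the paraventricular nucleus",
    source="Allen & Cechetto (1993)",
)

population = Entity(
    name="neuronPopulation1",
    knowledge_type="description",
    domain_type="tract-tracing experiment",
    # Property names come from the slide; the values are placeholders.
    properties={"injectionSite": "<unspecified>", "labeling": "<unspecified>"},
    fragments=[fragment],
    annotations=["Placeholder note explaining why this entity was recorded."],
)

# The slide shows LHA and ZI joined by a relation but does not say of what kind.
link = Relation(kind="<unspecified>", first=Entity("LHA"), second=Entity("ZI"))

print(population.name, len(population.fragments), sorted(population.properties))
print(link.first.name, link.second.name)
```

Keeping the supporting fragments on each entity means that any statement in a knowledge model can be traced back to the text that justifies it.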

Page 25

Distributed Online Sources of Information

‘Fragments’

Local Implementation

Page 26

Distributed Online Sources of Information

‘Fragments’

Local Implementation

Users’ Spaces & Models

Centralized Published Knowledge Repository

Page 27

Distributed Online Sources of Information

Users’ Spaces & Models

‘Fragments’

‘Pending Review’

Page 28

Distributed Online Sources of Information

Users’ Spaces & Models

‘Fragments’

P2P sharing

Knowledge Model Comparison

Page 29

Knowledge Model Comparison

Given two users A & B, with Knowledge Models KA & KB being shared under the P2P model.

We want A to be able to run a program that automatically compares KB to KA so that the discrepancies and contradictions between the two models can be understood and reconciled.
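
A comparison of this kind can be sketched by treating each knowledge model as a set of keyed claims. The representation below is an assumption made for illustration (a dictionary from a pair of names to an asserted value); the function and type names are hypothetical and the region names X, Y, Z are placeholders.

```python
from typing import Dict, List, NamedTuple, Tuple

Claim = Tuple[str, str]          # hypothetical key: (first object, second object)


class Comparison(NamedTuple):
    agreements: List[Claim]
    contradictions: List[Claim]
    only_in_a: List[Claim]
    only_in_b: List[Claim]


def compare_models(ka: Dict[Claim, str], kb: Dict[Claim, str]) -> Comparison:
    """Split the claims of two models into agreements, contradictions and gaps."""
    shared = ka.keys() & kb.keys()
    return Comparison(
        agreements=sorted(k for k in shared if ka[k] == kb[k]),
        contradictions=sorted(k for k in shared if ka[k] != kb[k]),
        only_in_a=sorted(ka.keys() - kb.keys()),
        only_in_b=sorted(kb.keys() - ka.keys()),
    )


# Placeholder models for users A and B.
KA = {("X", "Y"): "present", ("X", "Z"): "present", ("Y", "Z"): "absent"}
KB = {("X", "Y"): "present", ("X", "Z"): "absent", ("Z", "X"): "present"}

result = compare_models(KA, KB)
print(result.agreements)       # [('X', 'Y')]
print(result.contradictions)   # [('X', 'Z')]
print(result.only_in_a)        # [('Y', 'Z')]
print(result.only_in_b)        # [('Z', 'X')]
```

Contradictions are the claims the two users would need to reconcile; the "only in" lists are discrepancies in coverage rather than disagreements.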

Page 30

What’s wrong with this picture? … from a computer scientist’s point of view …

Where is the formal logic?

It’s o.k. if we only export knowledge models to a formal logic-based representation rather than base our entire approach on it.

Knowledge Acquisition is the rate-limiting step!

Page 31

Knowledge Representation

Knowledge representation is a multidisciplinary subject that applies theories and techniques from three other fields:

1. Logic provides the formal structure and rules of inference.

2. Ontology defines the kinds of things that exist in the application domain.

3. Computation supports the applications that distinguish knowledge representation from pure philosophy…

Sowa (2000), Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA.

Page 32

Knowledge Representation

… Without logic, a knowledge representation is vague, with no criteria for determining whether statements are redundant or contradictory. Without ontology, the terms and symbols are ill-defined, confused, and confusing. And without computable models, the logic and ontology cannot be implemented in computer programs. Knowledge representation is the application of logic and ontology to the task of constructing computable models for some domain.  

Sowa (2000), Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA.

Page 33

III. Implementation

In which the design issues become concerned with more pressing concerns like: ‘how are we actually going to build this thing?’

Page 34

Some implementation choices

Built under UML-based software engineering paradigm: The View-Primitive-Data-Model framework (‘VPDMf’)

Object Oriented Design: Unified Modeling Language (UML), PerlOO, Java

Relational Databases: MySQL, Informix

Exporting Ontologies (via the VPDMf): XML, RDF, F-logic

Exporting Logic: Embedded within typed Relation objects within the OO knowledge model. Use simple method overloading in Java to run Knowledge Model Comparison
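
The idea of embedding comparison logic in typed relation objects can be illustrated with type-based dispatch. The slide names Java method overloading; the sketch below swaps in Python's `functools.singledispatch`, which picks a rule from the type of the first argument only, so it is an analogy rather than the project's implementation. The relation types `Connection` and `Containment` and their rules are hypothetical.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class Relation:
    source: str
    target: str


@dataclass(frozen=True)
class Connection(Relation):      # hypothetical relation type
    strength: str                # e.g. "none", "light", "moderate"


@dataclass(frozen=True)
class Containment(Relation):     # hypothetical relation type: source contains target
    pass


@singledispatch
def contradicts(a, b) -> bool:
    """Fallback for relation types that carry no comparison rule."""
    return False


@contradicts.register
def _(a: Connection, b) -> bool:
    # Same pair of endpoints, but a different reported strength.
    return (isinstance(b, Connection)
            and (a.source, a.target) == (b.source, b.target)
            and a.strength != b.strength)


@contradicts.register
def _(a: Containment, b) -> bool:
    # "X contains Y" conflicts with "Y contains X".
    return (isinstance(b, Containment)
            and a.source == b.target and a.target == b.source)


print(contradicts(Connection("X", "Y", "moderate"), Connection("X", "Y", "none")))  # True
print(contradicts(Containment("X", "Y"), Containment("Y", "X")))                    # True
print(contradicts(Connection("X", "Y", "light"), Containment("X", "Y")))            # False
```

Each relation type carries its own notion of contradiction, so a new type can be added without touching the comparison loop that calls `contradicts`.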

Page 35

VPDMf System Builder

VPDMf specs (Data Model file & VPDMf XML files)

UML-based documentation

DBMS

User Interface

Component

Final Working System

Forward Engineering

Reverse Engineering

Page 36

Implementation Plan

Diagram labels: Main Database, Plugins, VPDMf Client App, Local Database, Server, Client, Review Database, VPDMf Admin App, Plugins

Page 37

Implementation Plan

Diagram labels: Main Database, Plugins, VPDMf Client App, Local Database, Local Apps, Server, Client, Review Database, VPDMf System Builder, VPDMf Admin App, Plugins

Page 38

Implementation Plan

Diagram labels: Main Database, Plugins, VPDMf Client App, Local Database, Server, Client, Review Database, VPDMf Admin App, Plugins, Demonstration

Page 39

Large scale organization of NeuroScholar’s schema

Data management of publication data

General knowledge management structures

Annotations, Justifications, Judgements

Experimental data, General histological data

Neuroanatomical tract tracing data

Final output of the system: the knowledge model

Components of the knowledge model specific to neuronal data

General data constructs used throughout the system

Page 40

e.g., Views from ‘bibliography’

Excerpt
bl_x : int32 = 0
bl_y : int32 = 0
tr_x : int32 = 0
tr_y : int32 = 0
li_x : int32 = 0
li_y : int32 = 0
ri_x : int32 = 0
ri_y : int32 = 0
pagenumber : int32
enclose_excerpt()
enclose_corner_area()

Journal
JournalTitle : String
PublisherName : String
Abbr : String
ISSN : String

Author
Affiliation : string(40)
LastName : string(40)
Initials : string(10)

Fragment
fragment_type : object(CV)

Article
Pages : string(10)
Volume : int32
Issue : int32
Abstract : String
PMID : int32
Title : string(255)
Language : object(CV)
PubDate : year
checksum : String
size : int32

CV (from coreSystem)
name : String
context : String
description : String

Associations (role name, multiplicity), as far as they can be read from the diagram labels:
Fragment (+fragment, 1) to Excerpt (+excerpts, 1..n)
Article (0..*) to Journal (+journal, 1)
Article (+publishedWork, 1..n) to Author (+authorList, 1..n)
Article (+publication, 1) to Fragment (+fragments, 0..n)
One association end is marked <<ordered>>.
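
The bibliography classes above translate fairly directly into code. The sketch below is an assumption-laden illustration, not the VPDMf-generated schema: attribute names follow the diagram, CV-typed fields are simplified to strings, the links between classes are a guess based on the role names, and `citation()` is a hypothetical helper. The sample record uses only the citation details given earlier in the slides; the excerpt page number is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Journal:
    JournalTitle: str = ""
    PublisherName: str = ""
    Abbr: str = ""
    ISSN: str = ""


@dataclass
class Author:
    LastName: str
    Initials: str
    Affiliation: str = ""


@dataclass
class Excerpt:
    pagenumber: int
    # Corner and indent coordinates default to 0, as in the diagram.
    bl_x: int = 0
    bl_y: int = 0
    tr_x: int = 0
    tr_y: int = 0
    li_x: int = 0
    li_y: int = 0
    ri_x: int = 0
    ri_y: int = 0


@dataclass
class Fragment:
    fragment_type: str                                   # object(CV) in the diagram
    excerpts: List[Excerpt] = field(default_factory=list)


@dataclass
class Article:
    journal: Journal
    authorList: List[Author]                             # assumed to be the ordered end
    PubDate: int                                         # year
    Volume: Optional[int] = None
    Issue: Optional[int] = None
    Pages: str = ""
    Title: str = ""
    PMID: Optional[int] = None
    fragments: List[Fragment] = field(default_factory=list)

    def citation(self) -> str:
        """Hypothetical helper: a short one-line citation."""
        names = ", ".join(f"{a.LastName} {a.Initials}" for a in self.authorList)
        return f"{names} ({self.PubDate}) {self.journal.Abbr} {self.Volume}:{self.Pages}"


article = Article(
    journal=Journal(Abbr="J Comp Neurol"),
    authorList=[Author("Allen", "GV"), Author("Cechetto", "DF")],
    PubDate=1993,
    Volume=330,
    Pages="421-438",
)
# Placeholder fragment: the type and page number are not taken from the source.
article.fragments.append(
    Fragment(fragment_type="<unspecified>", excerpts=[Excerpt(pagenumber=421)])
)

print(article.citation())   # Allen GV, Cechetto DF (1993) J Comp Neurol 330:421-438
```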

Page 41

ViewDefinitionArticle

ViewDefinitionFragment

ViewLink

Page 42

ViewLink

Page 43

Basic Functionality: The ViewStateMachine & Forms

States: Start, Query, Insert, List, Display, Edit

Transitions (event / action):
Query( viewType ) / runQuery
Insert( viewType ) / runInsert
Execute( viewInstance ) / runExecute
Select( viewType, viewID ) / runSelect
Commit( viewInstance ) / runCommit
ClearInsert( viewType ) / runClearInsert
Edit( viewInstance ) / runEdit
Update( viewInstance ) / runUpdate
Cancel( viewType, viewID ) / runCancel
Delete( viewType, viewID ) / runDelete
JumpToStart
JumpToQuery
JumpToInsert
JumpToDisplay( viewType, ViewID )
BackToQuery( viewInstance )
BackToList( viewInstance )
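
A state machine over these states and events can be sketched with a transition table. The state and event names come from the slide, but the table itself is an assumption: the flattened diagram does not show which state each transition leaves or enters, so the pairings below are a plausible guess, and the event arguments (viewType, viewID, viewInstance) are omitted for brevity.

```python
# ASSUMED transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("Start", "Query"): "Query",
    ("Start", "Insert"): "Insert",
    ("Query", "Execute"): "List",
    ("List", "Select"): "Display",
    ("List", "BackToQuery"): "Query",
    ("Display", "Edit"): "Edit",
    ("Display", "Delete"): "List",
    ("Display", "BackToList"): "List",
    ("Edit", "Update"): "Display",
    ("Edit", "Cancel"): "Display",
    ("Insert", "Commit"): "Display",
    ("Insert", "ClearInsert"): "Insert",
}

# Jump events are assumed to be allowed from any state.
JUMPS = {
    "JumpToStart": "Start",
    "JumpToQuery": "Query",
    "JumpToInsert": "Insert",
    "JumpToDisplay": "Display",
}


class ViewStateMachine:
    """Minimal sketch of a form-driven state machine."""

    def __init__(self) -> None:
        self.state = "Start"
        self.history = []          # (from state, event, to state)

    def fire(self, event: str) -> str:
        """Apply an event, returning the new state or raising ValueError."""
        if event in JUMPS:
            nxt = JUMPS[event]
        elif (self.state, event) in TRANSITIONS:
            nxt = TRANSITIONS[(self.state, event)]
        else:
            raise ValueError(f"event {event!r} is not allowed in state {self.state!r}")
        self.history.append((self.state, event, nxt))
        self.state = nxt
        return nxt


machine = ViewStateMachine()
for event in ["Query", "Execute", "Select", "Edit", "Update"]:
    machine.fire(event)

print(machine.state)   # Display
```

Keeping the transitions in a table makes the permitted form flow easy to inspect and to change without touching the dispatch code.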

Page 44

Additional Functionality: Specialized Form Controls & Plugins

1. The Article Robot Form Control: Uses PubMed to retrieve citation information easily

2. The Fragmenter Plugin: Allows delineation of fragments on pdf files

3. The AtlasMapper Plugin: Allows delineation of regions on brain maps

Page 50

IV. Demonstration

In which the truth is finally revealed

Page 51

Acknowledgements

This work is funded by the National Library of Medicine (RO1-LM07061-01)

Thanks to
Arshad Khan
Shahram Ghandehanderazdeh
Cyrus Shahabi
Mark O’Neill
Larry Swanson
Alan Watts
Mihail Bota
Wei Cheng Chen
Shyam Kapadia
Shanshan Song
Ning Zhang
Yi-Shin Chen