Session II: Scientific Publishing and Semantic Web W3C Semantic Web for Life Sciences Workshop October 27, 2004 Moderator: Alan R. Aronson

Session II: Scientific Publishing and Semantic Web

W3C Semantic Web for Life Sciences Workshop

October 27, 2004

Moderator: Alan R. Aronson

Foundations of Semantic Text Processing at NLM
Alan R. Aronson (National Library of Medicine)

Urchin RSS / The Urchin/Kowari Project
Ben Lund, David Wood (Nature Publishing Group, Tucana Technologies)

Semantic Web and Elsevier
Marc Krellenstein (Elsevier)

Semantic Web for Data Interpretation & Integration: Lessons Learned from Scientific Publishing and the Distributed Annotation System
Steve Chervitz (Affymetrix)

Foundations of Semantic Text Processing at NLM

Alan R. Aronson, PhD

National Library of Medicine

W3C Semantic Web for Life Sciences Workshop

October 27, 2004

Outline

• Unified Medical Language System (UMLS) Knowledge Sources
• The MetaMap Program
• The NLM Indexing Initiative
• SemRep (Semantic Representation)

The Unified Medical Language System

• UMLS Knowledge Sources
  • Metathesaurus
  • Semantic Network
  • SPECIALIST Lexicon
• MetamorphoSys (Metathesaurus subset extraction)
• Lexical/spelling tools (lvg, norm, Gspell)
• Knowledge Source Server

MetaMap

• Maps text to the Metathesaurus
  • Parse text into phrases
  • Generate word variants
  • Retrieve Metathesaurus candidates
  • Evaluate candidates against text phrases
  • Form final mapping
• Linguistically rigorous
• Partial matching
• Web interface and Java-based application
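
The five mapping steps listed on this slide can be illustrated with a toy sketch. This is not MetaMap itself: the three-concept table, the variant table, the preposition-based phrase splitter, and the scoring formula are all hypothetical simplifications made up for illustration. The real program draws on the SPECIALIST Lexicon and the full Metathesaurus and uses a more elaborate evaluation.

```python
# Toy stand-ins (all hypothetical): a three-concept "Metathesaurus" and a
# tiny variant table in place of real lexical variant generation.
TOY_CONCEPTS = {
    "Drug Therapy, Combination": {"drug", "therapy", "combination"},
    "Kidney Failure": {"kidney", "failure"},
    "Hypercalcemia": {"hypercalcemia"},
}
TOY_VARIANTS = {
    "chemotherapy": {"drug", "therapy"},
    "renal": {"kidney"},
    "hypercalcemic": {"hypercalcemia"},
}
PREPOSITIONS = {"in", "of"}
FUNCTION_WORDS = PREPOSITIONS | {"the"}


def parse_phrases(text):
    """Step 1: start a new phrase at every preposition (a crude chunker)."""
    phrases, current = [], []
    for word in text.lower().split():
        if word in PREPOSITIONS and current:
            phrases.append(current)
            current = []
        current.append(word)
    if current:
        phrases.append(current)
    return phrases


def variants(word):
    """Step 2: the word itself plus any variants from the toy table."""
    return {word} | TOY_VARIANTS.get(word, set())


def score(content_words, concept_words):
    """Step 4: average of concept coverage and phrase coverage."""
    hits = [w for w in content_words if variants(w) & concept_words]
    if not hits:
        return 0.0
    matched = set().union(*(variants(w) for w in hits)) & concept_words
    concept_coverage = len(matched) / len(concept_words)
    phrase_coverage = len(hits) / len(content_words)
    return (concept_coverage + phrase_coverage) / 2


def map_phrase(phrase):
    """Steps 3-5: score every toy concept and keep the best non-zero one."""
    content = [w for w in phrase if w not in FUNCTION_WORDS]
    if not content:
        return None
    scored = [(score(content, words), name) for name, words in TOY_CONCEPTS.items()]
    best_score, best_name = max(scored)
    return best_name if best_score > 0 else None


def map_text(text):
    """Run the whole toy pipeline and pair each phrase with its concept."""
    return [(" ".join(p), map_phrase(p)) for p in parse_phrases(text)]


if __name__ == "__main__":
    sentence = ("aggressive combination chemotherapy in the management "
                "of hypercalcemic renal failure")
    for phrase, concept in map_text(sentence):
        print(f"{phrase!r} -> {concept}")
```

Averaging the two coverage values is one simple way to let a partial match compete: "Hypercalcemia" matches one word of the last phrase, but "Kidney Failure" covers more of it and therefore wins.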

NLM Indexing Initiative (II)

• Investigate automated and semi-automated indexing methodologies

• Develop methods that result in acceptable retrieval performance
  • Concept-based algorithms
  • Extensive use of UMLS resources
• Medical Text Indexer (MTI), a tool for
  • semi-automated assistance in MEDLINE indexing
  • automatic indexing of some abstracts collections

SemRep

• Family of programs to extract semantic relationships from biomedical text
  • SemRep (the progenitor)
  • Arbiter (binding relationships)
  • EDGAR (drug-gene relationships)
  • SemSpec (hypernymic propositions)
  • SemGen (etiology of genetic diseases)

Language and Meaning

[Diagram relating language to meaning. Labels:
Language: Words; Syntactic Structure (Predicates, Arguments)
Meaning: Semantic Interpretation; Semantic Relation(Concept, Concept)
World Model: Relations, Entities]

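Lexical Look-up and Tagger

Example phrase: aggressive combination chemotherapy in the management of hypercalcemic renal failure

aggressive      adj
combination     noun
chemotherapy    noun
in              prep
the             det
management      noun
of              prep
hypercalcemic   adj
renal           noun
failure         noun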
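
As a rough illustration of lexical look-up, the sketch below tags the slide's example phrase from a small dictionary. The dictionary `TOY_LEXICON` and the function `tag` are hypothetical names; the part-of-speech labels are the ones shown on the slide. The real system consults the SPECIALIST Lexicon and uses a tagger to resolve ambiguity, neither of which is modelled here.

```python
# Hypothetical mini-lexicon; the tags are the ones shown on the slide.
TOY_LEXICON = {
    "aggressive": "adj",
    "combination": "noun",
    "chemotherapy": "noun",
    "in": "prep",
    "the": "det",
    "management": "noun",
    "of": "prep",
    "hypercalcemic": "adj",
    "renal": "noun",
    "failure": "noun",
}


def tag(text):
    """Look up each whitespace-separated word; unlisted words get 'unknown'."""
    return [(word, TOY_LEXICON.get(word, "unknown")) for word in text.lower().split()]


if __name__ == "__main__":
    phrase = ("aggressive combination chemotherapy in the management "
              "of hypercalcemic renal failure")
    for word, pos in tag(phrase):
        print(f"{word:<15}{pos}")
```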

Parser

Each word is shown as word (part of speech, role).

NP: aggressive (adj, mod) combination (noun, mod) chemotherapy (noun, head)
NP: in (prep, prep) the (det, det) management (noun, head)
NP: of (prep, prep) hypercalcemic (adj, mod) renal (noun, mod) failure (noun, head)
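
The phrase grouping and the head/mod labels on this slide can be approximated with two simple rules, sketched below under stated assumptions: a new phrase starts at each preposition, and the last noun in a phrase is its head while other nouns and adjectives are modifiers. These rules and the names `chunk` and `label_roles` are illustrative only and are not taken from the NLM parser.

```python
# Tagged input as shown on the tagger slide: (word, part of speech).
TAGGED = [
    ("aggressive", "adj"), ("combination", "noun"), ("chemotherapy", "noun"),
    ("in", "prep"), ("the", "det"), ("management", "noun"),
    ("of", "prep"), ("hypercalcemic", "adj"), ("renal", "noun"), ("failure", "noun"),
]


def label_roles(phrase):
    """Mark the last noun as head; other nouns and adjectives become mod."""
    noun_positions = [i for i, (_, pos) in enumerate(phrase) if pos == "noun"]
    head = noun_positions[-1] if noun_positions else None
    labelled = []
    for i, (word, pos) in enumerate(phrase):
        if i == head:
            role = "head"
        elif pos in ("noun", "adj"):
            role = "mod"
        else:
            role = pos  # prepositions and determiners keep their tag as role
        labelled.append((word, pos, role))
    return labelled


def chunk(tagged):
    """Assumed rule: every preposition opens a new phrase."""
    phrases, current = [], []
    for word, pos in tagged:
        if pos == "prep" and current:
            phrases.append(current)
            current = []
        current.append((word, pos))
    if current:
        phrases.append(current)
    return [label_roles(p) for p in phrases]


if __name__ == "__main__":
    for phrase in chunk(TAGGED):
        print("NP:", " ".join(f"{w} ({p}, {r})" for w, p, r in phrase))
```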

MetaMap

NP: aggressive (adj, mod) combination (noun, mod) chemotherapy (noun, head)
    maps to: Drug Therapy, Combination [topp: Therapeutic or Preventive Procedure]
NP: in (prep, prep) the (det, det) management (noun, head)
NP: of (prep, prep) hypercalcemic (adj, mod) renal (noun, mod) failure (noun, head)
    maps to: Kidney Failure [dsyn: Disease or Syndrome]

SemRep

NP: aggressive (adj, mod) combination (noun, mod) chemotherapy (noun, head)
    concept: Drug Therapy, Combination [topp]
NP: in (prep, prep) the (det, det) management (noun, head)
NP: of (prep, prep) hypercalcemic (adj, mod) renal (noun, mod) failure (noun, head)
    concept: Kidney Failure [dsyn]

Dependency grammar applies syntactic constraints for nominalization

SemRep

NP: aggressive (adj, mod) combination (noun, mod) chemotherapy (noun, head)
    concept: Drug Therapy, Combination [topp]
NP: in (prep, prep) the (det, det) management (noun, head)
NP: of (prep, prep) hypercalcemic (adj, mod) renal (noun, mod) failure (noun, head)
    concept: Kidney Failure [dsyn]

Relation: Drug Therapy, Combination TREATS Kidney Failure

Match semantic types between arguments and Semantic Network

medd-TREATS-dsyn
phsu-TREATS-dsyn
topp-TREATS-dsyn
topp-TREATS-inpo
topp-TREATS-sosy
topp-TREATS-anab
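
The type-matching step can be sketched as a set-membership check. The triples below are the ones listed on the slide; the function names and the `INDICATORS` table (which assumes the nominalization "management" signals TREATS, as the example suggests) are hypothetical and only illustrate the idea, not SemRep's actual implementation.

```python
# Semantic Network triples exactly as listed on the slide.
TREATS_TRIPLES = {
    ("medd", "TREATS", "dsyn"),
    ("phsu", "TREATS", "dsyn"),
    ("topp", "TREATS", "dsyn"),
    ("topp", "TREATS", "inpo"),
    ("topp", "TREATS", "sosy"),
    ("topp", "TREATS", "anab"),
}

# Assumption for illustration: the nominalization "management" indicates TREATS.
INDICATORS = {"management": "TREATS"}


def is_licensed(subject_type, predicate, object_type, network=TREATS_TRIPLES):
    """True when the Semantic Network contains this type-level triple."""
    return (subject_type, predicate, object_type) in network


def interpret(subject, indicator, obj):
    """Return a (concept, predicate, concept) relation, or None if unlicensed.

    subject and obj are (concept name, semantic type) pairs.
    """
    predicate = INDICATORS.get(indicator)
    if predicate is None:
        return None
    if is_licensed(subject[1], predicate, obj[1]):
        return (subject[0], predicate, obj[0])
    return None


if __name__ == "__main__":
    therapy = ("Drug Therapy, Combination", "topp")
    disease = ("Kidney Failure", "dsyn")
    print(interpret(therapy, "management", disease))
    print(interpret(disease, "management", therapy))  # argument types reversed
```

With the arguments reversed, no dsyn-TREATS-topp triple exists in the listed set, so the sketch returns no relation. This is the sense in which the Semantic Network constrains interpretation.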

NLM Web Pointers

• UMLS Knowledge Source Server: http://umlsks.nlm.nih.gov/

• Semantic Knowledge Representation Project: http://skr.nlm.nih.gov/

• NLM Indexing Initiative: http://ii.nlm.nih.gov/