Introductory Review of Current Knowledge Organization Systems/Structures/Services (KOS) Marcia Lei Zeng Second International Seminar on Subject Access to Information, Helsinki, Finland, 29-30 November 2007


Page 1: Introductory Review of Current Knowledge Organization ...

Introductory Review of Current Knowledge Organization Systems/Structures/Services (KOS)

Marcia Lei Zeng
Second International Seminar on Subject Access to Information, Helsinki, Finland, 29-30 November 2007

Page 2: Introductory Review of Current Knowledge Organization ...

M.L.Zeng @ ISSAI, Helsinki,2007 2

Purpose of this talk

• Introduce different types of knowledge organization systems/structures/services (KOS)

• Provide a common terminology and background

Page 3: Introductory Review of Current Knowledge Organization ...

1. KOS overview (1)

Knowledge organization systems/structures/services (KOS) encompass all types of schemes for organizing information and promoting knowledge management.
– (Gail Hodge, 2000)

Page 4: Introductory Review of Current Knowledge Organization ...

1. KOS overview (2)

These systems
• model the underlying semantic structure of a domain, and
• provide semantics, navigation, and translation through labels, definitions, typing, relationships, and properties for concepts.
– (Hill et al. 2002, Koch and Tudhope 2004)

Page 5: Introductory Review of Current Knowledge Organization ...

A Taxonomy of KOS

[Figure: KOS types arranged along two axes, Structure and Function]

Term Lists: Pick lists; Glossaries/Dictionaries; Synonym Rings

Metadata-like Models: Authority Files; Directories; Gazetteers

Classification & Categorization: Subject Headings; Categorization schemes; Taxonomies; Classification schemes

Relationship Models: Thesauri; Semantic networks; Ontologies

Page 6: Introductory Review of Current Knowledge Organization ...

2. Fundamentals of KOS Approaches

• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts

Page 7: Introductory Review of Current Knowledge Organization ...

2.1 Eliminating ambiguity

• Ambiguity: terms having the same spelling (homographs) that represent different concepts or meanings

• Ambiguity exists when a given term can be used to represent completely different concepts.

Page 8: Introductory Review of Current Knowledge Organization ...

Ambiguity / Homographs

Source: Z39.19-2005, p.25

Page 9: Introductory Review of Current Knowledge Organization ...

To eliminate ambiguity (1)

1. Adding a qualifier to a term
-- one of the major methods used by almost every type of KOS, especially lists of subject headings and thesauri.

• e.g., Mercury (automobile)
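The qualifier method can be sketched in a few lines of Python. This is only an illustration: `Mercury (automobile)` is the slide's example, while the other vocabulary entries and the helper names `split_term` and `candidates` are assumptions made for the sketch.

```python
import re

# Qualified terms in a controlled vocabulary. Only "Mercury (automobile)"
# appears on the slide; the other entries are illustrative assumptions.
VOCABULARY = [
    "Mercury (automobile)",
    "Mercury (metal)",
    "Mercury (planet)",
    "Helsinki",
]

# A label followed by a parenthetical qualifier, e.g. "Mercury (planet)".
QUALIFIED = re.compile(r"^(?P<label>.+?)\s*\((?P<qualifier>[^()]+)\)$")


def split_term(term):
    """Return (label, qualifier); qualifier is None for unqualified terms."""
    match = QUALIFIED.match(term)
    if match:
        return match.group("label"), match.group("qualifier")
    return term, None


def candidates(user_input, vocabulary=VOCABULARY):
    """List every vocabulary term whose bare label matches the input."""
    wanted = user_input.strip().lower()
    return [t for t in vocabulary if split_term(t)[0].lower() == wanted]
```

Calling `candidates("mercury")` returns all three qualified forms, so an interface could ask the user which concept is meant instead of guessing.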

Page 10: Introductory Review of Current Knowledge Organization ...

To eliminate ambiguity (2)

2. Providing a scope note
-- another major method used by almost every type of KOS, especially lists of subject headings, classifications, and thesauri.

Screenshot from MeSH, http://www.nlm.nih.gov/mesh/MBrowser.html
Entry: mercury

Page 11: Introductory Review of Current Knowledge Organization ...


http://www.nlm.nih.gov/mesh/MBrowser.html

Page 12: Introductory Review of Current Knowledge Organization ...

To eliminate ambiguity (3)

3. Providing a context for a term

Page 13: Introductory Review of Current Knowledge Organization ...

What are these?

• Flying Horse
• King Fisher
• Royal Challenge
• Heineken
• Budweiser
• Miller-Lite
• Bud-Light

Page 14: Introductory Review of Current Knowledge Organization ...

Drinks
• Flying Horse
• King Fisher
• Royal Challenge
• Taj Mahal
• Hayward’s 2000
• Heineken
• Corona
• Budweiser
• Miller-Lite
• Bud-Light

Page 15: Introductory Review of Current Knowledge Organization ...

Lists (Pick lists)
A type of controlled vocabulary included in the NISO Z39.19 Standard

Page 16: Introductory Review of Current Knowledge Organization ...

• Lists are used to describe aspects of content objects or entities that have a limited number of possibilities.

• Examples include:
– geography (e.g., country, state, city),
– language (e.g., English, French, Swedish),
– format (e.g., text, image, sound), or
– … …

Page 17: Introductory Review of Current Knowledge Organization ...

Lists can be used effectively for both browsing and searching.

• In browsing, items are directly accessed when the list of terms is reviewed and one term is selected

Page 18: Introductory Review of Current Knowledge Organization ...


Source: http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml

Page 19: Introductory Review of Current Knowledge Organization ...

• In searching, a list may be used to access content in a single term search, or the terms from the list may be used to limit a retrieved set by another attribute of interest for the user (one or more terms in the search).
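The two uses of a list, browsing and limiting a search, might be sketched as follows. The records, field names, and the functions `browse` and `limit` are hypothetical and exist only for this illustration; the list values reuse the format and language examples given earlier.

```python
# Hypothetical records; the titles and field names are illustrative only.
RECORDS = [
    {"title": "Harbour at dawn", "format": "image", "language": "English"},
    {"title": "Field recording", "format": "sound", "language": "Swedish"},
    {"title": "Parish register", "format": "text", "language": "English"},
    {"title": "Coastal survey", "format": "text", "language": "French"},
]

# Pick lists: a small, closed set of allowed values per attribute.
PICK_LISTS = {
    "format": ["image", "sound", "text"],
    "language": ["English", "French", "Swedish"],
}


def browse(attribute, term, records=RECORDS):
    """Browsing: select one term from a list and go straight to its items."""
    if term not in PICK_LISTS[attribute]:
        raise ValueError(f"{term!r} is not in the {attribute} list")
    return [r["title"] for r in records if r[attribute] == term]


def limit(titles, attribute, term, records=RECORDS):
    """Searching: narrow an already retrieved set by a term from a list."""
    allowed = set(browse(attribute, term, records))
    return [t for t in titles if t in allowed]
```

For example, `limit(browse("format", "text"), "language", "English")` first retrieves the text items and then keeps only those in English.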

Page 20: Introductory Review of Current Knowledge Organization ...


Source: Google’s advanced search http://www.google.com

Page 21: Introductory Review of Current Knowledge Organization ...

pick lists

Waterford County Image Archive, http://www.waterfordcountyimages.org

Page 22: Introductory Review of Current Knowledge Organization ...

Waterford County Image Archive, http://www.waterfordcountyimages.org

Page 23: Introductory Review of Current Knowledge Organization ...

List - Definition, Purpose, and Uses

• A list (also called a pick list) is a limited set of terms arranged as a simple alphabetical list or in some other logically evident way.
– A list is a series of terms in some sequential order.
– Terms can be ordered alphabetically, chronologically, numerically, etc.

Page 24: Introductory Review of Current Knowledge Organization ...

Exercise: Which list is better?

Page 25: Introductory Review of Current Knowledge Organization ...

• The defining characteristics of a list are that the terms:
· are all members of the same set or class of items (e.g., countries, products)
· are not overlapping in meaning
· are equal in terms of specificity (granularity)
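These characteristics are editorial judgments, but a rough check is possible once an editor has annotated each term. The sketch below assumes a hypothetical `Term` record with an editor-assigned class and specificity level. It cannot detect overlapping meanings as such, only exact duplicate labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str
    term_class: str  # the set the term belongs to, e.g. "country"
    level: int       # editor-assigned specificity; equal numbers = same granularity


def list_problems(terms):
    """Return human-readable problems; an empty result means the list passes."""
    problems = []
    if len({t.term_class for t in terms}) > 1:
        problems.append("terms come from more than one class")
    if len({t.level for t in terms}) > 1:
        problems.append("terms differ in specificity")
    labels = [t.label.casefold() for t in terms]
    if len(set(labels)) != len(labels):
        problems.append("duplicate labels")
    return problems
```

A list mixing `Term("Finland", "country", 1)` with `Term("Helsinki", "city", 2)` would be flagged on both class and specificity.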

Page 26: Introductory Review of Current Knowledge Organization ...

Typical applications

• Lists are frequently used to display small sets of terms that are to be used for quite narrowly defined purposes such as a web pull-down list or list of menu choices.

Page 27: Introductory Review of Current Knowledge Organization ...

2. Fundamentals of KOS Approaches

• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts

Page 28: Introductory Review of Current Knowledge Organization ...

2.2 Controlling synonyms or equivalents

• Synonyms: terms with the same or similar meanings

1. True synonyms (unusual)
– mean exactly the same thing and are used in precisely the same context

2. Near synonyms (most common)

Page 29: Introductory Review of Current Knowledge Organization ...

1. True Synonyms

• common and technical names
– salt vs. sodium chloride
• changes in usage of terms over time
– electronic calculating machines vs. computers
• in different languages
– eyeglasses, spectacles, glasses
• acronyms
– BBC, British Broadcasting Corporation; MPG, miles per gallon
• variant spellings:
– cancelled, canceled; honor, honour

Page 30: Introductory Review of Current Knowledge Organization ...

2. Near Synonyms

• Same stem
– computing, computers, computed, microcomputers, supercomputers

• Overlapping concepts
– medicine, drugs
– fired, laid off
– forest, woods
– arid, dry

• General and specific terms
Coffee
– Double Espresso
– Latte
– Cappuccino
– Short Black
– Macchiato
– Flat White
– etc.

Page 31: Introductory Review of Current Knowledge Organization ...


Synonymy

Source: Z39.19-2005, p.25

Page 32: Introductory Review of Current Knowledge Organization ...

• Each distinct concept should refer to a unique linguistic form.

• Information or content that is provided to a user should not spread across the system under multiple access points, but should be gathered together in one place.

Page 33: Introductory Review of Current Knowledge Organization ...

Controlling synonyms: there will only be one term used to represent a given concept or entity.

Authority File

… …
150 World War, 1939-1945
450 European War, 1939-1945
450 Second World War, 1939-1945
450 World War 2, 1939-1945
450 World War II, 1939-1945
450 World War Two, 1939-1945

Source: FAST: Faceted Application of Subject Terminology, http://fast.oclc.org/

or:

Thesaurus

World War, 1939-1945
UF European War, 1939-1945
UF Second World War, 1939-1945
UF World War 2, 1939-1945
UF World War II, 1939-1945
UF World War Two, 1939-1945

European War, 1939-1945
USE World War, 1939-1945

Second World War, 1939-1945
USE World War, 1939-1945

World War 2, 1939-1945
USE World War, 1939-1945

World War II, 1939-1945
USE World War, 1939-1945

World War Two, 1939-1945
USE World War, 1939-1945
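The two displays above carry the same information, which a short Python sketch can make concrete. It assumes the record is available as simple (tag, value) pairs, with 150 as the established heading and 450 as a variant form, as in the slide. The function names `to_thesaurus` and `resolve` are made up for the example.

```python
# One authority record in the shape shown on the slide: field 150 holds the
# established heading, each 450 field holds a variant ("see from") form.
RECORD = [
    ("150", "World War, 1939-1945"),
    ("450", "European War, 1939-1945"),
    ("450", "Second World War, 1939-1945"),
    ("450", "World War 2, 1939-1945"),
    ("450", "World War II, 1939-1945"),
    ("450", "World War Two, 1939-1945"),
]


def to_thesaurus(record):
    """Turn one authority record into UF and USE entries."""
    preferred = next(value for tag, value in record if tag == "150")
    variants = [value for tag, value in record if tag == "450"]
    used_for = {preferred: variants}                     # preferred UF variant
    use = {variant: preferred for variant in variants}  # variant USE preferred
    return used_for, use


def resolve(term, use):
    """Map any entry term to the single preferred term."""
    return use.get(term, term)
```

With `used_for, use = to_thesaurus(RECORD)`, a lookup such as `resolve("World War II, 1939-1945", use)` leads to the one preferred heading, so content indexed under it is gathered in one place.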

Page 34: Introductory Review of Current Knowledge Organization ...

Source: Art and Architecture Thesaurus (AAT)

Page 35: Introductory Review of Current Knowledge Organization ...


Source: Medical Subject Headings (MeSH)

Page 36: Introductory Review of Current Knowledge Organization ...

Synonym Rings
A type of controlled vocabulary included in the NISO Z39.19 Standard

Page 37: Introductory Review of Current Knowledge Organization ...

astronaut, spaceman, cosmonaut, spationaut, taikonaut

A synonym ring connects a set of words that are defined as equivalent for retrieval.

Page 38: Introductory Review of Current Knowledge Organization ...

An example from International SEMATECH.

A search for Silicon would look like this:

Your search was submitted as “SILICON” or “SI”

Page 39: Introductory Review of Current Knowledge Organization ...

Synonym rings are used:

• to expand queries for content objects
– If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster.

• in systems where the underlying content objects are left in their unstructured natural language format
– The control is achieved through the interface by drawing together similar terms into these clusters.

• in conjunction with search engines
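Query expansion with a synonym ring can be sketched as below. The ring is the astronaut example from the earlier slide; the documents and the function names `expand` and `search` are hypothetical, and the tokenization is deliberately naive.

```python
# The ring from the slide: every member is treated as equivalent for retrieval.
RINGS = [
    {"astronaut", "spaceman", "cosmonaut", "spationaut", "taikonaut"},
]

# Hypothetical unstructured documents, left in natural language.
DOCUMENTS = {
    1: "The cosmonaut returned to Earth after six months.",
    2: "A taikonaut carried out the first spacewalk of the mission.",
    3: "The engineer inspected the launch pad.",
}


def expand(query, rings=RINGS):
    """Return the query term plus every other member of its ring."""
    term = query.strip().lower()
    for ring in rings:
        if term in ring:
            return set(ring)
    return {term}


def search(query, documents=DOCUMENTS):
    """Retrieve ids of documents containing any term of the expanded query."""
    terms = expand(query)
    hits = []
    for doc_id, text in documents.items():
        words = {w.strip(".,;:!?").lower() for w in text.split()}
        if terms & words:
            hits.append(doc_id)
    return hits
```

Note that the documents themselves are never re-indexed: control happens only at query time, which is why rings suit collections left in unstructured natural language.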

Page 40: Introductory Review of Current Knowledge Organization ...

Poverty mitigation

Poverty alleviation

Poverty elimination

Poverty reducation

Poverty eradication

Poverty abatement

Poverty prevention

Poverty reduction

Rings can include all kinds of synonyms - true, misspellings, predecessors, abbreviations

Source: Bedford, 2006 ppt.

Page 41: Introductory Review of Current Knowledge Organization ...

Exercise

• Find synonyms of this type of object:

Page 42: Introductory Review of Current Knowledge Organization ...

2. Fundamentals of KOS Approaches

• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts

Page 43: Introductory Review of Current Knowledge Organization ...

2.3 Making explicit semantic relationships – Hierarchical relationships

Birds
. Cardinals
. Doves
. Robins
. Wrens

All specific names of birds are kinds of birds.

Page 44: Introductory Review of Current Knowledge Organization ...

Scientific Taxonomy
An example: Leatherback turtle

Phylum: Chordata
Class: Reptilia
Subclass: Anapsida
Order: Testudines
Suborder: Cryptodira
Family: Dermochelyidae
Genus: Dermochelys
Species: Dermochelys coriacea (Leatherback turtle)
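A hierarchy like this one can be stored as a set of "broader" links and walked in either direction. The sketch below uses the leatherback turtle lineage from the slide; the dictionary layout and the function names `lineage` and `narrower` are assumptions for illustration.

```python
# Each entry points to its broader (parent) entry. The values come from the
# leatherback turtle example on the slide.
BROADER = {
    "Dermochelys coriacea": "Dermochelys",
    "Dermochelys": "Dermochelyidae",
    "Dermochelyidae": "Cryptodira",
    "Cryptodira": "Testudines",
    "Testudines": "Anapsida",
    "Anapsida": "Reptilia",
    "Reptilia": "Chordata",
}


def lineage(term, broader=BROADER):
    """Walk upward from a term to the top of the hierarchy."""
    path = [term]
    while path[-1] in broader:
        path.append(broader[path[-1]])
    return path


def narrower(term, broader=BROADER):
    """List the direct children of a term."""
    return sorted(child for child, parent in broader.items() if parent == term)
```

Storing only the upward link keeps each entry to a single parent, which matches a strict taxonomy; a polyhierarchy would need a list of parents instead.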

Page 45: Introductory Review of Current Knowledge Organization ...

Classifications

superordinate classes (e.g., parents)
. coordinate classes (e.g., siblings)
. . subordinate classes (e.g., children)
. . subordinate classes
. coordinate classes
. coordinate classes
. . subordinate classes

relationship types: generic, instance, and whole-part

Page 46: Introductory Review of Current Knowledge Organization ...


Page 47: Introductory Review of Current Knowledge Organization ...


Page 48: Introductory Review of Current Knowledge Organization ...

2.3 Making explicit semantic relationships – Associative relationships (not hierarchical)

Relationship                       Example
Part / Whole                       Bicycle / Bicycle Wheel
Cause / Effect                     Accident / Injury
Process / Agent                    Velocity measurement / Speedometer
Action / Product                   Writing / Publication
Action / Patient                   Teaching / Student
Concept or Thing / Properties      Steel alloy / Corrosion resistance
Concept or Thing / Origins         Water / Well
Thing or Action / Counter-agent    Pest / Pesticide
Raw material / Product             Grapes / Wine
Action / Property                  Communication / Communication skills
Antonyms                           Single people / Married people
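Associative links of this kind are usually recorded in both directions, so that either term leads to the other. The following is a minimal sketch using three pairs from the table; the tuple layout and the function name `build_related` are assumptions made for illustration.

```python
from collections import defaultdict

# (term A, term B, relationship type) taken from the table on the slide.
PAIRS = [
    ("Accident", "Injury", "Cause / Effect"),
    ("Pest", "Pesticide", "Thing or Action / Counter-agent"),
    ("Grapes", "Wine", "Raw material / Product"),
]


def build_related(pairs):
    """Record each pair in both directions so the link is reciprocal."""
    related = defaultdict(list)
    for a, b, kind in pairs:
        related[a].append((b, kind))
        related[b].append((a, kind))
    return related
```

After `related = build_related(PAIRS)`, looking up `related["Wine"]` returns `[("Grapes", "Raw material / Product")]`, and the lookup from `"Grapes"` returns the mirror entry.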

Page 49: Introductory Review of Current Knowledge Organization ...


Page 50: Introductory Review of Current Knowledge Organization ...


Page 51: Introductory Review of Current Knowledge Organization ...

Source: Z39.19-2005, p.29

Page 52: Introductory Review of Current Knowledge Organization ...

KOS in Use at World Bank

• Topic Thesaurus (500,000+ English terms, French and Spanish language versions in progress now)
• Topic Classification Scheme (30 top classes, 700+ subtopics, 300+ subsubtopics)
• Business Function Thesaurus (50,000 terms and growing)
• Business Function Classification Scheme (5 business areas, 30 lines of business, 300+ business processes)
• Country-Region classification scheme (6 regions, ca. 200 countries)
• Content Type Classification Scheme (8 content types, 300+ secondary content types – in refinement now)
• Media-Format Classification Scheme
• Country Name Authority Control (synonym, predecessor, successor sources)
• Edition Statements Authority Control
• Publisher Name Authority Control
• Organization Authority Control
• Language Authority Control
• Series Name/Collection Title Authority Control
• Translation Type Authority Control

Source: Bedford, 2007, ASIST

Page 53: Introductory Review of Current Knowledge Organization ...

Vision of An Enterprise Advanced Search

[Figure labels: Pick lists; Hierarchical taxonomy; Synonym Rings]

Source: Revised based on Bedford, 2006 ppt.

Page 54: Introductory Review of Current Knowledge Organization ...

[Figure labels: Synonym Rings; Thesaurus; Metadata]

Source: Revised based on Bedford, 2006 ppt.

Page 55: Introductory Review of Current Knowledge Organization ...

2. Fundamentals of KOS Approaches

• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts

Page 56: Introductory Review of Current Knowledge Organization ...

2.4 Presenting relationships as well as properties of concepts

• Entity types
• Relationship types
• Properties

Page 57: Introductory Review of Current Knowledge Organization ...

Semantic networks

organize sets of terms representing concepts, modeled as the nodes in a network of variable relationship types.
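A network of nodes joined by variable relationship types can be sketched as a list of triples. The concepts and relation names below are illustrative assumptions and are not drawn from any particular semantic network.

```python
# A tiny semantic network as (subject, relation type, object) triples.
# The concepts and relation names are illustrative assumptions.
TRIPLES = [
    ("Robins", "is_a", "Birds"),
    ("Bicycle wheel", "part_of", "Bicycle"),
    ("Pesticide", "counteracts", "Pest"),
    ("Wine", "made_from", "Grapes"),
]


def neighbours(node, triples=TRIPLES, relation=None):
    """Return (relation, other node, direction) for edges touching a node."""
    found = []
    for subject, rel, obj in triples:
        if relation is not None and rel != relation:
            continue
        if subject == node:
            found.append((rel, obj, "out"))
        elif obj == node:
            found.append((rel, subject, "in"))
    return found
```

Unlike a hierarchy, nothing here limits the set of relation types: adding a new kind of link only means adding triples with a new relation name.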

Page 58: Introductory Review of Current Knowledge Organization ...


UMLS Semantic Network

135 Semantic Types (link) and 54 Semantic Relation Types (link)

Page 59: Introductory Review of Current Knowledge Organization ...
Page 60: Introductory Review of Current Knowledge Organization ...

Source: Noy, N. F. and Tu, S.W. (2003).

Ontologies

Classes

attributes

instances

Page 61: Introductory Review of Current Knowledge Organization ...


Page 62: Introductory Review of Current Knowledge Organization ...


Page 63: Introductory Review of Current Knowledge Organization ...


The Graph view of relations

Page 64: Introductory Review of Current Knowledge Organization ...


Page 65: Introductory Review of Current Knowledge Organization ...

A Taxonomy of KOS © 2007 Zeng

[Figure: matrix of KOS types, ordered by structure from flat through two dimensions to multiple dimensions, against their major functions]

KOS types (by structure):
Term Lists: Pick lists; Glossaries/Dictionaries; Synonym Rings
Metadata-like Models: Authority Files; Directories; Gazetteers
Classification & Categorization: Subject Headings; Categorization schemes; Taxonomies; Classification schemes
Relationship Models: Thesauri; Semantic networks; Ontologies

Major functions:
• eliminating ambiguity
• controlling synonyms
• establishing relationships: hierarchical
• establishing relationships: associative
• presenting properties

Page 66: Introductory Review of Current Knowledge Organization ...

Networked KOS → NKOS

• KOS are not used in isolation;
• KOS may be used, re-used, and re-purposed in web-based services;
• KOS are used for:
– organizing, indexing, cataloging, and searching, AND
– learning, knowledge modeling, reasoning, etc.
• NKOS need to be machine-processable, machine-understandable
– (more to discuss later today)

Page 67: Introductory Review of Current Knowledge Organization ...

References

• Hodge, Gail (2000). Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Washington, DC: Council on Library and Information Resources. http://www.clir.org/pubs/reports/pub91/contents.html and http://www.clir.org/pubs/reports/pub91/pub91.pdf

• Hill, Linda, Buchel, Olha, Janee, Greg, and Zeng, Marcia L. (2002). Integration of knowledge organization systems into digital library architectures. In: Mai, Jens-Erik, et al. ed.: Advances of classification research, volume 13, proceedings of the 13th ASIST SIG/CR Workshop, 17 November 2002, Philadelphia PA, pp. 62-68.

• Koch, Traugott and Tudhope, Douglas (2004). User-centred approaches to Networked Knowledge Organization Systems/Services (NKOS): Background. http://www2.db.dk/nkos-workshop/#Background