Knowledge Organization in the Light of Intertextual Semantics
A Natural-Language Analysis of Controlled Vocabularies
Yves MARCOUX, Élias RIZKALLAH
GRDS – EBSI, Université de Montréal
ISKO 2008 - Montréal 2
Overview
• Intertextual semantics (IS)
• IS's view of controlled vocabularies (CVs)
• Example
• Consequences of IS view
• Future work
Intertextual semantics (IS)
• A way to envision how meaning is conveyed by information-bearing objects
• Based on natural language (NL)
• Not a semantics for natural language
• Rather a natural-language semantics for artificial information-bearing objects
• Goal: design "better" information-bearing objects (more effective and usable)
Scope of IS reflection
• Information-bearing objects
  – Primarily structured documents (e.g., XML)
  – Any data structure designed to hold information in an information system
    • Ex.: database table / record / field
• Communication of meaning to human persons interacting with the object through any kind of interface
IS – Background (1/2)
• Introduced at Extreme Markup Languages (EML) 2006
  – valid XML documents only
  – modeler-author communication
  – further development (EML 2007)
• Applied to classical data structure for information exchange (SIGDOC 2007)
IS – Background (2/2)
• One in a series of semiotics-based approaches to improve systems design
  – Knuth (1984), De Souza (2005)
• One in a series of semantic frameworks for structured documents (XML, etc.)
  – Sperberg-McQueen et al. (2000), Renear et al. (2002), Wrightson (2005)
Example
Facts about some US cities
City         Population  Annual snowfall (inches)
Denver       850,000     23
Rochester    240,000     88
Palm Spring  48,000      0
Modeler prepares “peritext” segments
Element                    text-before                             text-after
facts-about-US-cities      "Here are facts about some US cities."  empty
city                       " The city "                            "."
name                       "named "                                empty
population                 " has a population of "                 " inhabitants "
annual-snowfall-in-inches  " and an annual snowfall of "           " inches"
Possible “semantic” (or IS) view for authors
Here are facts about some US cities. The city named Denver has a population of 850,000 inhabitants and an annual snowfall of 23 inches. The city named Rochester has a population of 240,000 inhabitants and an annual snowfall of 88 inches. The city named Palm Spring has a population of 48,000 inhabitants and an annual snowfall of 0 inches.
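The view above can be read as the result of wrapping each element's content in the modeler's text-before and text-after segments. A minimal Python sketch of that idea follows. The peritexts are copied from the table; the XML encoding of the city data, the function names `render` and `is_view`, and the final whitespace normalization are assumptions made for illustration, since the slides show only the table and the resulting prose.

```python
import xml.etree.ElementTree as ET

# Peritexts from the modeler's table: element name -> (text-before, text-after).
PERITEXTS = {
    "facts-about-US-cities": ("Here are facts about some US cities.", ""),
    "city": (" The city ", "."),
    "name": ("named ", ""),
    "population": (" has a population of ", " inhabitants "),
    "annual-snowfall-in-inches": (" and an annual snowfall of ", " inches"),
}

# Hypothetical XML encoding of the table of city facts (not shown in the slides).
DOCUMENT = """
<facts-about-US-cities>
  <city>
    <name>Denver</name>
    <population>850,000</population>
    <annual-snowfall-in-inches>23</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Rochester</name>
    <population>240,000</population>
    <annual-snowfall-in-inches>88</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Palm Spring</name>
    <population>48,000</population>
    <annual-snowfall-in-inches>0</annual-snowfall-in-inches>
  </city>
</facts-about-US-cities>
"""


def render(element):
    """Return text-before + own text + rendered children + text-after."""
    before, after = PERITEXTS[element.tag]
    parts = [before, (element.text or "").strip()]
    # Text that follows a child element (mixed content) is ignored in this sketch.
    parts.extend(render(child) for child in element)
    parts.append(after)
    return "".join(parts)


def is_view(xml_text):
    """Render the whole document, then collapse runs of whitespace (an assumption)."""
    root = ET.fromstring(xml_text)
    return " ".join(render(root).split())


print(is_view(DOCUMENT))
```

An element with no entry in `PERITEXTS` raises a `KeyError` here, which is one simple way to notice that the specification does not cover the document.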
Example
• Raw XML document:
<billing>
  <amount-burial>1205.47</amount-burial>
  <payable-burial>D</payable-burial>
  <amount-cremation>788.00</amount-cremation>
  <payable-cremation>F</payable-cremation>
</billing>
IS view
IS specification of the model (peritexts prepared by modeler)
Element            text-before                                                    text-after
billing            "This section gives the billing information for this order. "  " End of billing information section."
amount-burial      "Amount charged for the burial service: "                      " Canadian dollars; "
payable-burial     "this amount is payable by: "                                  " (D = Funeral director; F = Family)."
amount-cremation   "Amount charged for the cremation service: "                   " Canadian dollars; "
payable-cremation  "this amount is payable by: "                                  " (D = Funeral director; F = Family)."
IS – Key ideas
• The semantic (IS) view is the reference interpretation and should convey, in NL, to humans, all the meaning intended / expected by the modeler
• The semantic (IS) view can (and should) contain hyperlinks to material not already known by the target community of users, but necessary to make sense of the data structure
IS – Hypothesis (ISH-1)
• The IS view of a document is one of the most workable incarnations of its meaning
  – Wittgensteinian position
• The (human) task of interpreting the IS view of a document is representative of the task of "understanding" the document
IS – Consequences on design
• An intricate prose structure in the IS view, or a high number of hyperlink traversals, indicates that the document (or data structure) is hard to understand
  – Gaps imply an incomprehensible document!
• Design goals for modelers are thus:
  – Prose as simple as possible (but no simpler)
  – Low number of hyperlink traversals
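The two design signals named above could be approximated mechanically on a rendered IS view. The Python sketch below is an illustration under stated assumptions, not a measure proposed in the talk: sentence length in words stands in for prose intricacy, the number of URLs present in the text stands in for hyperlink traversals, and the function name `is_view_metrics` is hypothetical.

```python
import re


def is_view_metrics(is_view_text):
    """Compute crude, assumed proxies for how hard an IS view is to read."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", is_view_text.strip()) if s]
    # Longest sentence, in words: a rough stand-in for intricate prose.
    longest = max((len(s.split()) for s in sentences), default=0)
    # Links present in the text; actual traversals depend on the reader.
    hyperlinks = re.findall(r"https?://\S+", is_view_text)
    return {
        "sentences": len(sentences),
        "longest_sentence_words": longest,
        "hyperlinks": len(hyperlinks),
    }


sample = (
    "Here are facts about some US cities. The city named Denver has a "
    "population of 850,000 inhabitants and an annual snowfall of 23 inches."
)
metrics = is_view_metrics(sample)
print(metrics)
```

A modeler could compare these numbers across alternative peritext tables for the same model. They say nothing about whether the prose is actually understood, which still requires human readers.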
IS – Notes
• The network of resources anchored (via hyperlinks) in the semantic view suggests an actual interpretation (sense-making) path, but does not impose it
• Any specific reading of a document yields more information than the IS view, but the IS view is considered a minimum for all readings and thus serves as a reference
Controlled vocabularies (CVs)
• Same scope as SKOS concept schemes:
  – Thesauri, classification schemes, subject heading systems, subject indexes, taxonomies
• CVs are data structures
  – Designed by information professionals
  – Populated by corpus analysts ("authors")
  – Used by document analysts to index documents, and users to find documents
CVs in IS
• SKOS allows CVs to be expressed as XML documents
  – Eases the thought experiment of applying IS
• A CV can be expressed as a single XML document
  – Not as reductive as it sounds...
  – Example will concentrate on designer-author communication
SKOS example
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>Manmade waterway used by watercraft or for drainage, irrigation, or water power</skos:definition>
    <skos:scopeNote>A feature type category for places such as the Erie Canal</skos:scopeNote>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:prefLabel>hydrographic structures</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
IS view of same example
[… Introductory section for the whole CV: background, purpose, scope, etc. (omitted) …]
Section for concept with formal identifier: http://www.my.com/#canals
This concept can be defined as Manmade waterway used by watercraft or for drainage, irrigation, or water power. It can be used as A feature type category for places such as the Erie Canal. The official accepted word or expression for referring to this concept is canals. Another word or expression commonly used to refer to this concept is drainage canals. canals are special cases of hydrographic structures.
End of section
Section for concept with formal identifier: http://www.my.com/#hydrographic%20structures
The official accepted word or expression for referring to this concept is hydrographic structures.
End of section
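The IS view of the SKOS example can be approximated by the same peritext mechanism, with two additions: attributes (`rdf:about`, `rdf:resource`) also contribute text, and the `skos:broader` sentence needs the preferred labels of both concepts. The Python sketch below rests on assumptions. The peritext strings are reverse-engineered from the IS view shown above rather than taken from a published specification, and the names `PERITEXTS`, `local_name`, and `is_view` are hypothetical. Its output matches the view above up to whitespace and line breaks.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

SKOS_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>Manmade waterway used by watercraft or for drainage, irrigation, or water power</skos:definition>
    <skos:scopeNote>A feature type category for places such as the Erie Canal</skos:scopeNote>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:prefLabel>hydrographic structures</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

# Assumed peritexts for SKOS elements, inferred from the IS view in the slides.
PERITEXTS = {
    "definition": ("This concept can be defined as ", ". "),
    "scopeNote": ("It can be used as ", ". "),
    "prefLabel": ("The official accepted word or expression for referring "
                  "to this concept is ", ". "),
    "altLabel": ("Another word or expression commonly used to refer "
                 "to this concept is ", ". "),
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.split("}", 1)[1]


def is_view(xml_text):
    root = ET.fromstring(xml_text)
    concepts = root.findall(f"{{{SKOS}}}Concept")
    # Map each concept URI to its preferred label, for the "broader" sentence.
    labels = {c.get(f"{{{RDF}}}about"): c.findtext(f"{{{SKOS}}}prefLabel")
              for c in concepts}
    sections = []
    for concept in concepts:
        uri = concept.get(f"{{{RDF}}}about")
        out = [f"Section for concept with formal identifier: {uri} "]
        for child in concept:
            name = local_name(child.tag)
            if name == "broader":
                target = child.get(f"{{{RDF}}}resource")
                # "is-a" reading of broader, as in the slides' IS view.
                out.append(f"{labels[uri]} are special cases of "
                           f"{labels.get(target, target)}. ")
            else:
                before, after = PERITEXTS[name]
                text = " ".join((child.text or "").split())
                out.append(before + text + after)
        out.append("End of section")
        sections.append("".join(out))
    return "\n".join(sections)


print(is_view(SKOS_XML))
```

Reading `skos:broader` as "are special cases of" is a modeling decision carried by the peritext, not something the SKOS element itself asserts. A different CV could attach a different sentence to the same element.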
IS specification
• Table of text-before and text-after for all SKOS elements and attributes
• Specified by designer (modeler) of CV before it is populated
IS specification
• Makes explicit the often hidden complexity of the CV model for users
• Is an opportunity for specifying extra semantics of the CV model, over and above SKOS semantics
  – Ex.: "is-a" instead of just "broader term"
• Clearly shows the cognitive price of using artificial codes, e.g., numbers instead of names to identify concepts
Extensions
• If SKOS extensions are used (e.g., custom relationships), an IS specification is even more useful, because there is no "standard" interpretation of extensions
Future work (1/2)
• Development of IS framework
  – From intertexts to geometrized text
  – Application to interface / interaction design
• Application to CVs
  – IS analysis of other uses of CVs, e.g., for indexing and searching
  – Work out an IS specification for a real CV and experiment
Future work (2/2)
• Integration of IS in SKOS
  – IS peritexts not expressed by refinement of SKOS documentation properties
  – Rather, by domain-specific XML elements and/or attributes
Thank you!
Questions?
yves.marcoux@umontreal.ca