Metadata for the Web A Necessary Evil?
![Page 1: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/1.jpg)
Metadata for the Web
A Necessary Evil?
CS 431 – March 2, 2005
Carl Lagoze – Cornell University
![Page 2: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/2.jpg)
“Metadata is data about data”
![Page 3: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/3.jpg)
Metadata is semi-structured data conforming to commonly agreed upon models, providing operational interoperability in a heterogeneous environment
![Page 4: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/4.jpg)
Are metadata and data distinguishable?
• Objectivity?
• Intellectual property?
• Structure?
• Aboutness?
![Page 5: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/5.jpg)
Some untested hypotheses
• Metadata is useful for…
  – People
  – Machines
• More metadata is better
• Simple metadata created by non-experts (Joe Sixpack and his smarter friends) is useful
![Page 6: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/6.jpg)
Metadata Quality as a Function of Creator Expertise
[Figure: Metadata Quality (Low to High) plotted against Creator Expertise (High to Low), with two curves labeled "Hoped" and "Actual?"]
![Page 7: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/7.jpg)
Some known facts
• Number and variety of metadata vocabularies will continue to increase
• The Tower of Babel is a franchise
  – There is not one common view of reality
• “The one thing I know about metadata is that it is expensive” (Bill Arms)
• “I hate metadata projects because they make every other digital library project more expensive” (Michael Lesk)
![Page 8: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/8.jpg)
Metadata Triage
![Page 9: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/9.jpg)
Metadata Takes Many Forms
resource discovery
document administration
rights management
content rating
security and authentication
archival status
products and services
database schemas
process control or description
![Page 10: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/10.jpg)
Metadata Challenges
• Accommodate multiple varieties of metadata
  – community-specific functionality, creation, administration, access
  – Lowest common denominator
• Tensions
  – functionality and simplicity
  – extensibility and interoperability
  – human and machine creation and use
![Page 11: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/11.jpg)
The fiction of classification
…there is no classification of the universe that is not fictional and conjectural.
Jorge Luis Borges
![Page 12: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/12.jpg)
Lenses and Views
• All classification does and should provide a biased lens or view of reality
• Each view emphasizes certain characteristics and hides others
[Figure: three example views labeled Geospatial, Rights, and Museum]
![Page 13: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/13.jpg)
Reality is Complex
Created by: George Castaldo
Created on: 1994
Created by: Leonardo da Vinci
Created on: 1506
Relationship?
![Page 14: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/14.jpg)
Objects are Related
IFLA Entity Model
![Page 15: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/15.jpg)
Haven’t we done metadata already?
![Page 16: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/16.jpg)
What’s wrong with this model?
• Expensive
  – Complex (even for its original goal?)
  – Professional intervention (assumes single community of expertise)
• Monolithic
  – One size fits all approach
  – Reflects its centralized system origins
• Bias towards physical artifacts
  – Fixed resources
  – Incomplete handling of resource evolution and other resource relationships
• Anglo-centric
![Page 17: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/17.jpg)
Why hasn’t metadata worked on the Web?
• It's all about trust
• People are lazy
• Metadata is hard
• No perceived benefit
  – “Reverse tragedy of the commons”
• No agreement on one way to describe things
• “Metacrap” - http://www.well.com/~doctorow/metacrap.htm
![Page 18: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/18.jpg)
The fifteen Dublin Core Elements
Creator Title Subject
Contributor Date Description
Publisher Type Format
Coverage Rights Relation
Source Language Identifier
http://dublincore.org/usage/terms/dc/current-elements/
http://dublincore.org
![Page 19: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/19.jpg)
A Pidgin for Digital Tourists
• Metadata is language
• Dublin Core is a small and simple language -- a pidgin -- for finding resources across domains.
• Speakers of different languages naturally "pidginize" to communicate
  – E.g., tourists using simple phrases to order beer ("zwei Bier bitte" "dva pivo prosim" "biru o san bai kudasi"...)
• We are all "tourists" on the global Internet.
![Page 20: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/20.jpg)
What is the Dublin Core (1)?
• A simple set of properties to support resource discovery on the web (fuzzy search buckets)?
[Figure label: Domain-Independent view]
![Page 21: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/21.jpg)
What is the Dublin Core (2)?
• An extensible ontology for resource description?
[Figure: axis labeled "Greater Functionality & Cost"]
![Page 22: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/22.jpg)
Progressive Metadata Models: Drill-Down Searching Paradigm
• Moving along a specificity spectrum
• Inter-domain vs. intra-domain terms, models, query mechanisms
![Page 23: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/23.jpg)
Drill-down search paradigm
[Figure: drilling down from a Domain-Independent view to a Domain-Specific view]
![Page 24: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/24.jpg)
What is the Dublin Core (3)?
• A cross-domain switchboard for interoperable metadata?
[Figure: a switchboard linking Dublin Core, MARC, INDECS, and IMS]
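One way to picture the switchboard idea, as a minimal sketch: each schema keeps a single mapping to Dublin Core, and a translation between any two schemas is routed through that hub. MARC tags 245, 100 and 650 are the usual title, main-entry personal name and topical subject fields; the `museum` schema and its field names are hypothetical, and real crosswalks lose more information than a one-to-one table suggests.

```python
# Each schema maps its own field names to Dublin Core elements.
# The MARC tags are real; the "museum" schema is a made-up illustration.
TO_DC = {
    "marc": {"245": "title", "100": "creator", "650": "subject"},
    "museum": {"object_name": "title", "maker": "creator", "theme": "subject"},
}


def to_dublin_core(schema, record):
    """Translate a record into Dublin Core, dropping fields with no mapping."""
    mapping = TO_DC[schema]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def from_dublin_core(schema, dc_record):
    """Translate a Dublin Core record into the target schema's field names."""
    reverse = {dc: field for field, dc in TO_DC[schema].items()}
    return {reverse[dc]: value for dc, value in dc_record.items() if dc in reverse}


def switchboard(source, target, record):
    """Route a record from one schema to another through Dublin Core."""
    return from_dublin_core(target, to_dublin_core(source, record))


marc_record = {
    "245": "Grammar of Dublin Core",
    "100": "Tom Baker",
    "008": "fixed-length data",  # has no Dublin Core equivalent in this table
}
print(switchboard("marc", "museum", marc_record))
```

With N schemas this arrangement needs N mappings to the hub rather than one mapping per pair of schemas; the price is that anything the hub cannot express (the `008` field here) is lost in transit.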
![Page 25: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/25.jpg)
Dublin Core Qualifiers
• From fuzzy buckets to more specific description
• Model of “graceful degradation”
  – Support both simplicity and specificity
  – Intra-domain and inter-domain semantics
![Page 26: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/26.jpg)
Varieties of qualifiers: Element Refinements
• Make the meaning of an element narrower or more specific.
• Narrowing implies an "is a" relationship
  – a "date created" is a "date"
  – an "is part of relation" is a "relation"
• If your software does not understand the qualifier, you can safely ignore it.
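The "is a" relationship can be sketched as a lookup from a refinement to the broader element it narrows. This is an illustrative assumption, not a prescribed implementation: the table holds only the two examples from the slide (using the DCMI term names `created` and `isPartOf`), and the `resolve` function and client sets are hypothetical.

```python
# Refinement -> the broader element it narrows ("is a" relationship).
# Only the slide's two examples are listed; a real table would be longer.
REFINES = {
    "created": "date",
    "isPartOf": "relation",
}


def resolve(property_name, understood):
    """Return the property itself if the software understands it,
    otherwise fall back to the broader element it refines."""
    if property_name in understood:
        return property_name
    return REFINES.get(property_name, property_name)


simple_client = {"date", "relation"}            # knows only unqualified elements
rich_client = {"date", "relation", "created"}   # also understands one refinement

print(resolve("created", simple_client))   # falls back to "date"
print(resolve("created", rich_client))     # keeps "created"
```

Because every refinement still "is a" member of its parent element, a client that ignores the qualifier treats the value as a plain date or relation and loses precision, not correctness.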
![Page 27: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/27.jpg)
Varieties of Qualifiers: Value Encoding Schemes
• Says that the value is
  – a term from a controlled vocabulary (e.g., Library of Congress Subject Headings)
  – a string formatted in a standard way (e.g., "2001-05-02" means May 2, not February 5)
• Even if a scheme is not known by software, the value should be "appropriate" and usable for resource discovery.
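A small sketch of how software might treat an encoding scheme, under the assumption that it recognizes only ISO 8601 dates: a known scheme lets the value be parsed unambiguously, while an unknown scheme leaves a string that is still usable for discovery. The `interpret` function and the scheme labels passed to it are hypothetical.

```python
from datetime import date


def interpret(value, scheme=None):
    """Parse the value if the scheme is one we know; otherwise keep the string."""
    if scheme == "ISO8601":
        # date.fromisoformat reads the YYYY-MM-DD form: year, then month, then day.
        return date.fromisoformat(value)
    return value


parsed = interpret("2001-05-02", scheme="ISO8601")
print(parsed.month, parsed.day)                          # month 5, day 2
print(interpret("2001-05-02"))                           # no scheme: plain string
print(interpret("Languages -- Grammar", scheme="LCSH"))  # unknown scheme: plain string
```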
![Page 28: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/28.jpg)
A Grammar of Dublin Core
• http://www.dlib.org/dlib/october00/baker/10baker.html
• By design not as subtle as mother tongues, but easy to learn and extremely useful in practice
• Pidgins: small vocabularies (Dublin Core: fifteen special nouns and lots of optional adjectives)
• Simple grammars: sentences (statements) follow a simple fixed pattern...
![Page 29: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/29.jpg)
Example Dublin Core statements
• Resource has Title 'Grammar of Dublin Core'.
• Resource has Creator 'Tom Baker'.
• Resource has Subject 'Metadata'.
• Resource has Relation http://foo.org/file.htm.
![Page 30: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/30.jpg)
Resource has property X
[Figure annotations: "Resource" is the implied subject; "has" is the implied verb; "property" is one of 15 properties (DC:Creator, DC:Title, DC:Subject, DC:Date, ...); "X" is the property value (an appropriate literal)]
![Page 31: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/31.jpg)
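Resource has property X
[Figure annotations: "Resource" is the implied subject; "has" is the implied verb; "property" is one of 15 properties (DC:Creator, DC:Title, DC:Subject, DC:Date, ...); "X" is the property value (an appropriate literal); two "[optional qualifier]" slots are added, labeled qualifiers (adjectives)]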
![Page 32: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/32.jpg)
Resource has Date "2000-06-13" (qualifiers: Revised, ISO8601)
Resource has Subject "Languages -- Grammar" (qualifier: LCSH)
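The statement pattern with optional qualifiers can be sketched as a small data structure. This is only an illustration: the `Statement` class, its field names, and the bracket notation in the rendered sentence are assumptions, not part of Dublin Core.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    element: str                      # one of the fifteen elements, e.g. "Date"
    value: str                        # the property value (a literal)
    refinement: Optional[str] = None  # optional element refinement, e.g. "Revised"
    scheme: Optional[str] = None      # optional encoding scheme, e.g. "ISO8601"

    def render(self, with_qualifiers=True):
        """Spell the statement out as a 'Resource has ...' sentence."""
        text = f"Resource has {self.element}"
        if with_qualifiers and self.refinement:
            text += f" ({self.refinement})"
        text += f' "{self.value}"'
        if with_qualifiers and self.scheme:
            text += f" [{self.scheme}]"
        return text + "."


revised = Statement("Date", "2000-06-13", refinement="Revised", scheme="ISO8601")
subject = Statement("Subject", "Languages -- Grammar", scheme="LCSH")

print(revised.render())
print(revised.render(with_qualifiers=False))
print(subject.render())
```

Rendering with `with_qualifiers=False` drops both adjectives and leaves the bare noun and its value, which is what software that does not understand the qualifiers would see.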
![Page 33: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/33.jpg)
Dumb-Down Principle for Qualifiers
• The fifteen elements should be usable and understandable with or without the qualifiers
• Qualifiers refine meaning (but may be harder to understand)
• Nouns can stand on their own without adjectives
• If your software encounters an unfamiliar qualifier, look it up -- or just ignore it!
• "has a" relations break the model
  – E.g., a creator has a hair color
![Page 34: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/34.jpg)
Resource has Date "2000-06-13" (qualifiers: Revised, ISO8601)
Resource has Subject "Languages -- Grammar" (qualifier: LCSH)
Test for “good” qualifiers: cover the qualifier and ask:
  -- Does the statement still make sense?
  -- Is it still correct?
![Page 35: Metadata for the Web A Necessary Evil?](https://reader035.fdocuments.in/reader035/viewer/2022062518/56814624550346895db3312c/html5/thumbnails/35.jpg)
“Incorrect” Qualification
Resource has subject "pre-schoolers" (qualifier: audience)
Resource has creator "Cornell University" (qualifier: affiliation)