Post on 01-Jan-2017
Dublin Core for Museums: Day 1
Paul Miller, UK Office for Library & Information Networking, p.miller@ukoln.ac.uk
Thomas Hofmann, Australian Museums On-Line, thomash@amol.org.au
John Perkins, CIMI, jperkins@cimi.org
Overview for Thursday March 25
Introduction to Metadata
Introducing the Dublin Core
CIMI DC Guidelines - Dublin Core for Museums
Break
DC for museums continued...
Lunch
Practicalities of Implementing DC
Break
Introduction to MICI
What’s the Problem?
Need to serve a Web audience
- Demand for content
- Uncertain quality
- Expectations for rapid easy access
Need to be visible on the Web
- Two million web sites
- Half a billion addressable pages
Many communities with the same problem
What’s the Problem?
Manage and organise interconnected data
- Different types
- Different repositories
- Packages
Interoperate with other communities
Interoperate with other applications
Need a way to:
- Express meanings in rich and complex data
- Express the structure of our data
- Encode the transfer of data
What’s the Solution ?
Communities address their own needs
Do so in a way that works across communities
Standards based
Collaborative
What is a Community?
A resource description community is characterised by agreed semantic, structural and syntactic conventions for exchange of descriptive information
Libraries: MARC, AACR2
Museums: SPECTRUM, MICI
Other communities: Scientific Databases, Geo, ‘Internet Commons’, Home Pages, Commerce, whatever...
Based on a slide by Stu Weibel
Communities working together
[Diagram: communities, Museums among them, connected to one another through Metadata]
Based on a slide by Stu Weibel
What is Metadata?
Meaningless jargon or
a fashionable term for what we’ve always done or
“a means of turning data into information” and
“data about data” and
the name of a film director (‘Luc Besson’) and
the title of a book (‘The Lord of the Flies’).
What is Metadata?
Metadata exists for almost anything
- People
- Places
- Objects
- Concepts
- Databases
- Web pages
What is Metadata?
Metadata fulfils three main functions:
- description of resource content: “What is it?”
- description of resource form: “How is it constructed?”
- description of issues behind resource use: “Can I afford it?”.
What is Metadata?
Many structures have evolved at different levels, and to meet different requirements...
MICI
For human communication we need...
Semantic Interoperability: standardisation of content
- “Let’s talk English”
- “cat milk sat drank mat”
Structural Interoperability: standardisation of form
- “Here’s how to make a sentence”
- “Cat sat on mat. Drank milk.”
Syntactic Interoperability: standardisation of expression
- “These are the rules of grammar”
- “The cat sat on the mat. It drank some milk.”
Challenges
Many flavours of metadata
- which one do I use?
Managing change
- new varieties, and evolution of existing forms
Tension between functionality and simplicity, extensibility and interoperability
- Functions, features, and cool stuff
- Simplicity and interoperability
Opportunities
Introducing the Dublin Core
An attempt to improve resource discovery on the Web
- now adopted more broadly
Building an interdisciplinary consensus about a core element set for resource discovery
- simple and intuitive
- cross–disciplinary
- international
- flexible.
Introducing the Dublin Core
15 elements of descriptive metadata
All elements optional
All elements repeatable
The whole is extensible
- offering a starting point for semantically richer descriptions
Interdisciplinary
- libraries, museums, government, education...
International
- available in 20 languages, with more on the way.
Introducing the Dublin Core
Title, Creator, Subject, Description, Publisher, Contributor, Date, Type,
Format, Identifier, Source, Language, Relation, Coverage, Rights
http://purl.org/dc/
Extending DC (semantic refinement)
Improve descriptive precision by adding sub–structure (subelements and schemes)
- Element qualifier
- Value qualifier
Creator: First Name, Surname, Contact Info, Affiliation
Greater precision = lesser interoperability
Should ‘dumb down’ gracefully
Based on a slide by Stu Weibel
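One way to picture ‘dumbing down’ is a small Python sketch. It assumes, purely for illustration, that qualified statements are held as (name, value) pairs whose names use a dotted Element.Qualifier form; the dumb_down function and the sample record are hypothetical, not part of any Dublin Core tool.

```python
# Hypothetical qualified record: names are "Element" or "Element.Qualifier".
qualified_record = [
    ("Creator.FirstName", "John"),
    ("Creator.Affiliation", "John Inc."),
    ("Date.Created", "1999-03-01"),
    ("Title", "My Web Page"),
]


def dumb_down(record):
    """Reduce qualified statements to unqualified elements.

    Everything after the first dot is treated as an element qualifier and
    dropped; the value itself is kept unchanged, so no information is
    invented, only precision is lost.
    """
    simple = {}
    for name, value in record:
        element = name.split(".", 1)[0]
        # Elements are repeatable, so values are collected in a list.
        simple.setdefault(element, []).append(value)
    return simple


simple_record = dumb_down(qualified_record)
print(simple_record)
```

An application that only understands the 15 unqualified elements can still use the result, which is the sense in which the richer record degrades gracefully.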
Extending DC (a modular approach)
Modular extensibility...
- additional elements to support local needs
- complementary packages of metadata
…but only if we get the building blocks right
Packages: Description, Archival Management, Terms & Conditions
Based on a slide by Stu Weibel
Extending DC?
DC offers a semantic framework
- through use of further substructure, meaning can often be clarified
<Creator> “John”: John Inc.? John xyz? xyz John?
<Creator> <fore name> “John”: John Inc., John xyz, xyz John.
Extending DC?
DC offers a semantic framework
- Use of domain–specific schemes greatly increases precision
<Coverage> “Washington”: Washington State? Washington DC? Washington monument?
<Coverage> <TGN> “Washington”: Washington State, Washington DC, Washington monument
- “North and Central America, United States, Washington”
http://gii.getty.edu/tgn_browser/
Dublin Core in the physical world
Dublin Core originally designed with electronic resources in mind
Physical resources are fundamentally different
- Issues of surrogacy become more important
- Genre, Type, and Format models vary greatly
- Difficult to remember what is being described, and which characteristics of the resource and its surrogates are ‘correct’.
Introducing Physical Objects
Aspects of the real world are key to much of what museums do
Physical objects have dimensions
- 23 x 46 cm
- 12 x 52 x 18 in
- 18.6 cm³
- 823 pages
Physical objects have a form
- oil on canvas
- Tadcaster limestone
- stainless steel.
Introducing Physical Objects
Physical objects change over time
- constructed between AD 524 and 873
- repaired in AD 1270
- incorporated into ornamental arch in AD 1320
Physical objects move
- cast in Beijing
- used in Shanghai
- taken to Hong Kong
- on display in Macau.
Introducing Physical Objects
Physical objects are associated with people
- written by William Shakespeare
- acquired by Lord Elgin
- decreed by the Emperor Hadrian
- associated with Prince Charles Edward Stuart
Physical objects are contextualised
- fired at the Battle of Trafalgar
- carried on Apollo 11 from the moon
- printed on the first printing press
- salvaged from the Titanic.
Introducing Collections
Museum objects, whether original or surrogate, are normally part of a collection
Collections may be ‘real’...
- the Sutton Hoo hoard
- the Terracotta Warriors
...an aspect of the process by which objects enter the museum...
- the Burrell Collection
- Solomon Guggenheim’s art collection
…or simply practical
- coins at the British Museum
- the Tate Gallery’s collection of works by Da Vinci.
Introducing Surrogacy
Many of the resources we describe are, in reality, surrogates for something else
- a photograph of King Tutankhamen’s death mask
- a photograph of a statue of George Washington
- a film of President Kennedy’s assassination
- a sound recording of Neil Armstrong’s “One small step for man…” speech on the moon
- a copy of the Mona Lisa
- a model of the Great Wall of China
- a reproduction of the Terracotta warriors.
Issues of Surrogacy
Many of the resources we describe are, in reality, surrogates for something else
- we need to be clear whether we are describing the resource or its surrogate
- the sculptor of a statue is often not the person who made its photographic surrogate
- the model of the Forbidden City is unlikely to have been created at the same date as the Forbidden City itself
- the format of a computer image of the Mona Lisa (image/jpeg?) is not the same as the format of the original painting (oil on canvas?).
Other Museum Issues
Museums need to describe real objects and surrogates in a similar manner
- guidelines/standards therefore need to encompass both, despite their differences
Resource descriptions will often be drawn from existing collection management systems in the first instance, rather than created afresh
- guidelines therefore need to respect existing practices within established systems
There is often no ‘right’ answer
- so practices need to allow for approximate dates, multiple possible creators, etc.
Introducing the 1:1 Principle
The broader Dublin Core community is tackling some of the problems relevant to museums
Their work on the ‘1:1 Principle’ is especially useful in resolving museum issues over original versus surrogate and item versus collection:
- each Dublin Core ‘record’ should describe only one resource, whether surrogate or original. Associated resources should be linked together by means of the Relation element in Dublin Core.
Introducing the 1:1 Principle
In a record describing a photo of the Mona Lisa on a web page, for example…
- Leonardo da Vinci is not the creator of the image
- The image was not created during the Renaissance
- …but you might include these as Subject terms, and you could usefully provide a link to the record describing the real painting via Dublin Core’s Relation element
Equally, in describing the painting itself…
- http://www.louvre.fr/…/monalisa.jpg is not the Identifier of the painting
- but you might link to this image via Relation, just to show people what the painting looks like.
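The Mona Lisa case can be sketched as two separate records, one per resource, that point at each other through Relation. This is only an illustration: records are modelled as plain Python dictionaries, the urn:example identifiers are made-up placeholders, and the related helper is hypothetical.

```python
# One record per resource (the 1:1 Principle), linked through Relation.
# The "urn:example:..." identifiers are hypothetical placeholders.
painting = {
    "Identifier": ["urn:example:painting/mona-lisa"],
    "Title": ["Mona Lisa"],
    "Creator": ["Leonardo da Vinci"],
    "Format": ["oil on canvas"],
    "Relation": ["urn:example:image/mona-lisa-jpeg"],
}
image = {
    "Identifier": ["urn:example:image/mona-lisa-jpeg"],
    "Title": ["Photograph of the Mona Lisa"],
    # Leonardo and the Renaissance describe what the image shows, so they
    # appear as Subject terms rather than as Creator or Date.
    "Subject": ["Leonardo da Vinci", "Renaissance"],
    "Format": ["image/jpeg"],
    "Relation": ["urn:example:painting/mona-lisa"],
}


def related(record, catalogue):
    """Return the catalogue records that `record` points to via Relation."""
    wanted = set(record.get("Relation", []))
    return [r for r in catalogue if wanted & set(r["Identifier"])]


catalogue = [painting, image]
linked_titles = [r["Title"][0] for r in related(image, catalogue)]
print(linked_titles)
```

Keeping Creator off the image record until the photographer is known avoids the common mistake of crediting the painter with the surrogate.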
The primacy of ‘Type’
In describing museum objects, it is often most useful to first decide what you are describing and why, rather than beginning with ‘who made it’ and ‘what is it called’, as is often the case with books
- if you know you’re describing a surrogate of the Mona Lisa, then you know Leonardo da Vinci is not the Creator; whoever made the surrogate is
- if you know you’re describing a collection of 20th century paintings, then you know that Picasso, Hockney et al are not the Creators; the collector is.
The primacy of ‘Type’
- if you know you’re describing the Sutton Hoo helmet, then the fact that it was added to a particular museum collection in 1939 perhaps doesn’t matter; that information is better placed in the collection record
- if you know you’re describing a natural specimen, then perhaps it has no Creator; there may be a ‘creator’ associated with its identification or collection, though.
Dublin Core for Museums: Assumptions
In applying Dublin Core to museums, we are making certain basic assumptions, many of which were tested by CIMI
- DC is appropriate for use in describing both physical and digital resources
- DC is easy to learn and simple to use
- Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records
- the creation of a DC record to describe a museum object is cost–effective, and aids the discovery of resources more than simply allowing access to the underlying Collection Management system might.
Practicalities of Implementing Dublin Core
Paul Miller, UK Office for Library & Information Networking, p.miller@ukoln.ac.uk
Thomas Hofmann, Australian Museums On-Line, thomash@amol.org.au
Overview
Creation and Maintenance
Harvesting and Distribution
Retrieval
Implementation Models
Case Study
Dublin Core - Refresher
15 simple elements
Focus on Resource Discovery not Resource Description
One Dublin Core record per resource
Interoperable across communities
Can be easily populated from existing databases
Can be formatted in XML/RDF or HTML
When should I use Dublin Core?
You have a rich standard but need a simpler one
You want to disclose your data to other communities using commonly understood semantics
You want to provide unified access to databases with different underlying schemas
You need core description semantics and don’t feel compelled to invent them anew
Considerations
Creation and Maintenance: tools, educate
Harvesting/Distribution: tools
Retrieval: tools, consensus, interface design
Creating and Maintaining Dublin Core Metadata
Encoding Dublin Core
HTML
- Unqualified: Easy
- Qualified: Overloaded Content (HTML 3.2), Additional Attribute (HTML 4)
RDF
- Based on XML
- Sophisticated
- More complex
Encoding Dublin Core - Unqualified
<HEAD>
<META NAME="DC.Title" CONTENT="My Web Page">
<META NAME="DC.Subject" CONTENT="Computers,Metadata">
</HEAD>
Encoding Dublin Core - Qualified (HTML 3.2)
<HEAD>
<META NAME="DC.Subject" CONTENT="(SCHEME=AAT)(LANG=EN) Statue, Granite">
</HEAD>
Encoding Dublin Core - Qualified (HTML 4)
<HEAD>
<META NAME="DC.Subject" SCHEME="AAT" LANG="EN" CONTENT="Statue, Granite">
</HEAD>
Encoding Dublin Core - Sub-Elements
<HEAD>
<META NAME="DC.Date.Created" CONTENT="(SCHEME=ISO8601) 1999-03-01">
<META NAME="DC.Date.Modified" SCHEME="ISO8601" CONTENT="1999-03-25">
</HEAD>
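The HTML 4 style encodings above are regular enough to generate from a database export. The following Python sketch shows one possible approach; the to_meta_tags function and its arguments are hypothetical, and the DC. prefix and SCHEME attribute simply follow the slides.

```python
from html import escape


def to_meta_tags(record, scheme_by_name=None):
    """Render (name, value) pairs as HTML 4 style Dublin Core META tags.

    `scheme_by_name` optionally maps an element name to the scheme that
    governs its value (an assumption made for this sketch).
    """
    scheme_by_name = scheme_by_name or {}
    lines = []
    for name, value in record:
        attrs = [f'NAME="DC.{escape(name, quote=True)}"']
        scheme = scheme_by_name.get(name)
        if scheme:
            attrs.append(f'SCHEME="{escape(scheme, quote=True)}"')
        # Escaping keeps quotes and ampersands in values from breaking the tag.
        attrs.append(f'CONTENT="{escape(value, quote=True)}"')
        lines.append("<META " + " ".join(attrs) + ">")
    return "\n".join(lines)


record = [
    ("Title", "My Web Page"),
    ("Subject", "Statue, Granite"),
    ("Date.Created", "1999-03-01"),
]
html_head = to_meta_tags(record, {"Subject": "AAT", "Date.Created": "ISO8601"})
print(html_head)
```

The output is meant to be pasted between the HEAD tags of the page being described.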
Encoding Dublin Core - RDF
...
<?xml:namespace href="http://iso.ch/8601/" as="ISO"?>
<RDF:RDF>
  <RDF:Description …>
    <DC:Date>
      <RDF:Description>
        <ISO:date>1999-03-25</ISO:date>
      </RDF:Description>
    </DC:Date>
  </RDF:Description>
</RDF:RDF>
Example Tool: DC Dot
http://www.ukoln.ac.uk/metadata/dcdot/
Semi-automated generation of Dublin Core
Cut and paste into document
Conversions to HTML, SOIF, XML, WHOIS++, USMARC, GILS
[Screenshot of http://www.ukoln.ac.uk/metadata/dc-dot/]
[Screenshots of DC Dot output]
Example Tool: Reggie
http://metadata.net
Generic creation tool for any metadata schema published to metadata.net
Currently supports: Dublin Core in 5 languages
Syntax: HTML META tags (V3.2 and 4.0), RDF
Example Tool: Site Generator
http://www.dstc.edu.au/RDU/MetaWeb/
Tool which parses local web site and automatically creates Dublin Core metadata
Syntax: HTML
Java-based tool which requires JDK 1.1
Further Information - Creation and Maint.
Metadata Creation Tools
- General metadata page at UKOLN: http://www.ukoln.ac.uk/metadata/software-tools/
- MetaWeb: http://www.dstc.edu.au/RDU/MetaWeb/
- TagGen SE: http://www.hisoftware.com/fact_sheetcc.htm
User Guides
- Official User Guide for Simple Dublin Core: http://purl.org/dc/core/documents/working_drafts/wd-guide-current.htm
- CIMI Guide to Best Practice: Dublin Core
Harvesting and Distributing Dublin Core Metadata
Harvesting / Distribution Tools
- Z39.50 Gateway
- Metadata Harvester
- Full-text Search Engine
Resources
Indexing, harvesting tools
- http://www.searchenginewatch.com/
- http://www.searchtools.com/
- http://www.ukoln.ac.uk/metadata/software-tools/
- http://www.dstc.edu.au/RDU/MetaWeb/
Z39.50
- http://www.ilrt.bris.ac.uk/discovery/z3950/resources/
- http://www.ukoln.ac.uk/dlis/z3950/resources/
Searching and Retrieving Dublin Core Metadata
Retrieval
Tools
- HTML - search forms
- HTML - predefined queries
- Z39.50 clients/Java applets
- Standalone applications
Interface design
Assist users:
- help them to understand what they are looking for
- give them an idea what terminologies you are using
- use commonly understood design language
Bringing it all together: Implementation Models
Implementation Models
- Harvesting DC into a repository (database)
- Distributed Database Search
- Full-text indexing with metadata extraction
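Implementation Models
Harvesting DC into a repository (database)
[Diagram: a Harvester gathers metadata from HTML, XML and other document types, including documents created dynamically from a database, into a Repository; the user sends a Query to the repository and then retrieves the resource]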
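The harvesting model can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than a description of any particular harvester: the DCMetaParser class, the harvest function and the example.org page are assumptions, and a real harvester would fetch pages over HTTP and write to a database instead of a dictionary.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect <META NAME="DC.*" CONTENT="..."> pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        if name.lower().startswith("dc."):
            # Strip the "DC." prefix and keep the element name and value.
            self.elements.append((name[3:], attr.get("content") or ""))


def harvest(pages):
    """Build an in-memory repository: URL -> list of (element, value)."""
    repository = {}
    for url, html_text in pages.items():
        parser = DCMetaParser()
        parser.feed(html_text)
        parser.close()
        repository[url] = parser.elements
    return repository


# Hypothetical page content, supplied inline to keep the sketch offline.
pages = {
    "http://example.org/page.html": """
        <HEAD>
        <META NAME="DC.Title" CONTENT="My Web Page">
        <META NAME="DC.Subject" SCHEME="AAT" CONTENT="Statue, Granite">
        <META NAME="keywords" CONTENT="ignored">
        </HEAD>
    """
}
repository = harvest(pages)
print(repository)
```

Queries then run against the repository, and the stored URL is what lets the user retrieve the resource itself.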
Implementation Models
Distributed Database Search
[Diagram: the user's Query goes to a Z39.50 Gateway, which searches three Z39.50 Servers; the user then retrieves the resource]
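Implementation Models
Full-text indexing with metadata extraction
[Diagram: an Indexer reads HTML, XML and other document types, including documents created dynamically from a database, into an Index DB; the user sends a Query to the index and then retrieves the resource]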
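As a rough picture of full-text indexing with metadata extraction, the Python sketch below keeps one inverted index in which every token is filed under the field it came from, so a search can target either the body text or a specific Dublin Core element. The build_index and search functions and the two sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Inverted index keyed by (field, token).

    The field "text" holds the full text; any other field is treated as an
    extracted metadata element.
    """
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for field, value in doc.items():
            for token in re.findall(r"\w+", value.lower()):
                index[(field, token)].add(doc_id)
    return index


def search(index, token, field="text"):
    """Return the sorted ids of documents containing `token` in `field`."""
    return sorted(index.get((field, token.lower()), set()))


# Hypothetical documents: full text plus metadata extracted from META tags.
documents = {
    "doc1": {"text": "A granite statue on display", "Subject": "Statue, Granite"},
    "doc2": {"text": "Notes about a statue of George Washington", "Creator": "John"},
}
index = build_index(documents)
print(search(index, "statue"))
print(search(index, "statue", field="Subject"))
```

Searching the Subject field returns fewer but more relevant hits than searching the full text, which is the benefit the metadata is meant to bring.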
Questions before implementation
Do I really need Dublin Core?
What is my budget?
What type of resources do I want to describe?
Which encoding format for which resource?
Do I have community support?
Can I provide creation tools?
Challenges of implementing Dublin Core
Intellectual
- Education of information creators
- Community consensus
- Resistance against sharing information
Technical
- Efficient tools
- Infrastructure
Economical
- Automatic generation vs. manual creation
- Cost of training
- Cost of tools
Dublin Core for the masses?
Why Dublin Core hasn’t hit the consumer market yet
- No killer application
- Lack of standardisation
- No support in public search engines
- No support in mass market applications
- Non transparent applications
- Inefficient handling in HTML
Further Information
Projects
- Official Dublin Core web site: http://purl.oclc.org/dc/projects/index.htm
Mailing lists
- Dublin Core Implementors workgroup mailing list: http://www.mailbase.ac.uk/lists/dc-implementors/
Case Study: AMOL
Case Study AMOL (1)
Gateway to Australian Museums and Galleries
Initial idea: One central access point for all Australian collections
Creation of AMOL standard record for object data due to lack of common standards
8 basic fields with focus on resource discovery and easy deployment from within existing databases
Fields: Object Title, Object Name, Creator, Description, Item ID, KeySearchTerms, Date/DateRange, Associated Places
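To show how an eight-field record of this kind might be carried over into Dublin Core, here is a small Python sketch. The field-to-element mapping is an illustrative guess made for this example and is not AMOL's published crosswalk; the amol_to_dc function and the sample record are likewise hypothetical.

```python
# Illustrative mapping only: an assumption for this sketch, not AMOL's own.
AMOL_TO_DC = {
    "Object Title": "Title",
    "Object Name": "Type",
    "Creator": "Creator",
    "Description": "Description",
    "Item ID": "Identifier",
    "KeySearchTerms": "Subject",
    "Date/DateRange": "Date",
    "Associated Places": "Coverage",
}


def amol_to_dc(amol_record):
    """Map an AMOL-style record onto repeatable Dublin Core elements."""
    dc = {}
    for field, value in amol_record.items():
        element = AMOL_TO_DC.get(field)
        if element is None or not value:
            continue  # unmapped or empty fields are skipped
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(values)
    return dc


# Hypothetical object record exported from a collection database.
amol_record = {
    "Object Title": "Granite statue",
    "Object Name": "Statue",
    "Item ID": "A1234",
    "KeySearchTerms": ["Statue", "Granite"],
    "Description": "",
}
dc_record = amol_to_dc(amol_record)
print(dc_record)
```

Because every element is optional and repeatable, empty fields can simply be left out and multi-valued fields need no special handling.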
Case Study AMOL (2)
AMOL search/system architecture - current system
[Diagram: mapped metadata is exported from the Legacy DB as HTML documents; a remote web server stores the HTML documents; the AMOL index server indexes them; the user queries the search engine and gets records delivered to the web browser]
Case Study AMOL (3)
Lessons Learned
Data and technology related
- Lack of consistent use of controlled vocabularies, quality of data recorded
- Performance of indexing software, lack of metadata support in public search engines
- High administration efforts
Intellectual
- Users have problems with “empty text box” approach
- Limited information in record to see context with larger picture
General
- Large institutions: bureaucratic machinery, complex collection systems designed without interoperability in mind
- Small institutions: concerned about security issues, fear of larger institutions
Case Study AMOL (4)
New perspectives
New resource types: Information about institutions, Images, Video, Audio, general HTML pages - goes beyond capabilities of standard AMOL record
Need to provide easier access for users
New cross community projects require interoperable metadata standards for cross domain searching
Strong move in Australia towards Dublin Core based metadata schemas driven by government
Strong move towards interpretation of objects through stories
Search Architecture and extended AMOL metadata standard
Case Study AMOL (5)
NEW AMOL search/system architecture
[Diagram: legacy databases, textual resources and AV resources; a remote web server providing dynamic access to ODBC databases; information mapped to DC based metadata plus index text, images; the AMOL index server; the user queries the search engine and gets records delivered to the web browser]
Case Study AMOL (6)
Future Directions
- Implementation of RDF for dynamically served databases and text style resources
- Consensus of community: Metadata Forum
- Further education of users: Metadata Workshops
- Creation of multi-type metadata schema based on Dublin Core
- Creation of mapping tools for easier database implementation
Case Study AMOL (7)
Recommendations
- Prepare good user guides
- Run workshops and educate museum professionals
- Get consensus from community
- Plan with interoperability in mind
- Evaluate tools and plan for future additions
Biggest problem still remaining: what is the benefit to the individual institution other than being interoperable for networked resources?
http://www.cimi.org/
For Machine Communication we need...
Semantic Interoperability: standardisation of content
- “Let’s talk Resource Description”
- “Creator, Publisher...”
Structural Interoperability: standardisation of form
- “Let’s use MICI”
- “Field # 1 Element Name”
Syntactic Interoperability: standardisation of expression
- “Here’s how to say it in HTML”
- “<Meta name= Element Name= “….”>”