Introduction to Metadata:

Overview and Guidelines Using the Dublin Core Metadata Schema

Amelia Breytenbach

Metadata Specialist

Amelia.breytenbach@up.ac.za

Institutional Repository Workshop

1-3 April 2009

Overview

• What metadata is
• Types of metadata
• What metadata does and why we use it
• Metadata standards
• Dublin Core Metadata Standard
• Encoding schemes
• Metadata creation
• Metadata documentation

Definition of Metadata

• Metadata describes other data.
– It provides information about a certain item's content, e.g. an image may include metadata that describes the picture size, colour depth, image resolution and date created
– A text document's metadata may contain information about the document’s length, the author, when the document was written and a short summary

Source: The Tech Terms Computer Dictionary http://www.techterms.com/definition/metadata

What Is Metadata?

• Standardized descriptions of resources that aid in the discovery and retrieval of resources, particularly in reference to information about electronic, or digital, material

• Describing individual files, single objects or complete collections

• Traditional library cataloging is a form of metadata; MARC 21 and the AACR2 rules used with it are metadata standards

Types of Metadata

• Descriptive
– title, author, extent, subject, keywords
• Structural
– unique identifiers, page numbers, special features (table of contents, indexes)
• Administrative or technical
– file formats, scanning dates, file compression format, image resolution
– Preservation: archival information
– Rights management: ownership, copyright, license information

What Does Metadata Do?

Metadata
• is the key to ensuring that resources will survive and continue to be accessible into the future
• is searchable and aids the identification and retrieval of resources
• helps the end user to search accurately and to evaluate a resource
• also assists in managing, maintaining and preserving digital collections
• facilitates interoperability
• supports archiving, security and authentication of digital resources

Why Use Metadata?

• Metadata provides the essential link between the information creator and the information user

• We can ensure that this objective is met by using metadata in accordance with international standards

Metadata Standards

• Data structure standards
– Standardized sets such as Dublin Core, VRA and MODS
• Data content standards
– Rules or guidelines for input
• Data value standards
– Lists of allowed values for an element
• Data format or encoding standards
– How to encode the metadata
• Data presentation standards
– Display of the metadata

Dublin Core as Structure Standard for DSpace

Qualified Dublin Core Metadata Element Set
• Mandatory elements in DSpace
– Title, Language and Date elements
• DSpace refinements
– Additional metadata qualifiers for some DC elements
• System generated metadata

Dublin Core Metadata Initiative (DCMI)

• An organization that aims to promote more intelligent resource discovery through the widespread adoption of interoperable metadata standards and the development of specialized metadata vocabularies for describing resources

• DCMI provides an international forum for identifying problems, developing understanding and proposing solutions

• Dublin Core website: http://dublincore.org/

Characteristics of Dublin Core

• The DC elements are
– simple to understand and apply
– subject independent with commonly understood terminology
– optional and repeatable
– international in scope
– extensible

Dublin Core Metadata Element Set

• Unqualified
– For coarse-grained discovery of resources
• Qualified
– For richer descriptions to enable more refined resource discovery
– Most digital library software uses qualified DC
• “Dumb-down” principle
– Collapse a refinement back into a core element
– Unqualified DC required for sharing metadata via the Open Archives Initiative
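The “dumb-down” principle above can be illustrated with a short Python sketch. It assumes that qualified fields are written in a dotted dc.element.qualifier form (for example dc.date.created); the function name dumb_down and the sample record are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def dumb_down(qualified: dict[str, list[str]]) -> dict[str, list[str]]:
    """Collapse qualified DC fields (e.g. 'dc.date.created') into core elements ('date')."""
    simple = defaultdict(list)
    for field, values in qualified.items():
        parts = field.split(".")
        # Expect 'dc.element' or 'dc.element.qualifier'; anything else is skipped.
        if len(parts) < 2 or parts[0] != "dc":
            continue
        # Keep only the core element name and drop the refinement.
        simple[parts[1]].extend(values)
    return dict(simple)


# Hypothetical qualified record, for illustration only.
record = {
    "dc.title": ["Mona Lisa"],
    "dc.title.alternative": ["La Gioconda"],
    "dc.date.created": ["2002-10-30"],
    "dc.format.extent": ["158KB"],
    "dc.format.medium": ["Image/JPEG file"],
}
simple_record = dumb_down(record)
```

Because Dublin Core elements are repeatable, both title values end up under the single element title. The refinement itself is lost, which is the price of the simpler, more widely shareable record.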

Dublin Core Metadata Element Set (Source: Miller, Steven J., 2007. Metadata for digital collections: an online workshop.)

Qualified Dublin Core in DSpace

Dublin Core Qualifiers

Two categories of qualifiers:

• Element refinement
– Makes the meaning of an element narrower or more specific. A refined element shares the meaning of the unqualified element, but with a more restricted scope

• Encoding scheme
– Identifies schemes that aid in the interpretation of an element value. These schemes include controlled vocabularies and formal notations
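As a rough illustration of the two kinds of encoding scheme, the Python sketch below checks a value against a controlled vocabulary (the DCMI Type Vocabulary) and against a formal notation (a simplified, date-only subset of W3CDTF). The function names and the scheme labels "DCMIType" and "W3CDTF" used as lookup keys are assumptions made for this example, not part of any particular repository software.

```python
import re
from datetime import date

# Controlled vocabulary: the terms of the DCMI Type Vocabulary.
DCMI_TYPES = frozenset({
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
})

# Formal notation: simplified W3CDTF, date-only forms YYYY, YYYY-MM, YYYY-MM-DD.
_DATE_ONLY = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf_date(value: str) -> bool:
    """Return True if value is a valid date-only W3CDTF string (time parts not handled)."""
    match = _DATE_ONLY.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1 so the calendar check still works.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


def conforms(value: str, scheme: str) -> bool:
    """Check a value against the named encoding scheme (hypothetical helper)."""
    if scheme == "W3CDTF":
        return is_w3cdtf_date(value)
    if scheme == "DCMIType":
        return value in DCMI_TYPES
    raise ValueError(f"unknown scheme: {scheme}")
```

For example, conforms("2002-10-30", "W3CDTF") holds, while conforms("30/10/2002", "W3CDTF") does not. Note that the vocabulary term is StillImage; the display label "Still Image" is not itself a term.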

Dublin Core Element Refinements

• Title: Alternative
• Creator: none
• Contributor: none
• Publisher: none
• Description: Abstract, Table of Contents
• Subject: none
• Coverage: Spatial, Temporal
• Format: Extent, Medium
• Type: none
• Date: Created, Available, Modified, Valid, Issued
• Language: none
• Relation: Is Version Of, Has Version, Is Replaced By, Replaces, Is Required By, Requires, Is Part Of, Has Part, Is Referenced By, References, Is Format Of, Has Format, Conforms To
• Source: none
• Identifier: none
• Rights: none

Encoding scheme for an element value

Qualifiers for dc.date element

Dublin Core Encoding Schemes

• Getty Thesaurus of Geographic Names Online (TGN)
– Coverage.spatial field
– http://www.getty.edu/research/conducting_research/vocabularies/tgn/index.html

• Art and Architecture Thesaurus Online (AAT)
– Subject element
– http://www.getty.edu/research/conducting_research/vocabularies/aat/

• Library of Congress Name Authority File (LCNAF)
– Creator, Publisher and Subject elements
– http://authorities.loc.gov/

• Library of Congress Thesaurus for Graphic Materials (LCTGM)
– Subject element
– http://www.loc.gov/rr/print/tgm1/

Dublin Core Original vs Digital Resource

• 1 : 1 principle
• Single metadata record with a mix of elements for the original and digital object
• Use repeatable Dublin Core elements in the same metadata description
• Use locally-defined elements and map to a Dublin Core element
– e.g. Date Original and Date Digital both map to the DC Date element

Example

Element         Original Painting                    Digital Image
Title           Mona Lisa                            Mona Lisa
Creator         Leonardo da Vinci                    Leonardo da Vinci
Date            1500                                 2002-10-30
Format.medium   Oil painting                         Image/JPEG file
Type            Still Image                          Still Image
Identifier      No. 779 [museum inventory number]    2002_0054.jpg
Format.extent   77 X 53 cm                           158KB
Rights          Not in copyright                     © [owner digital collection]
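One way to combine the original and the digital object in a single record is to keep locally-defined elements and map them to repeatable Dublin Core elements. The Python sketch below does this for the Mona Lisa example; the mapping table LOCAL_TO_DC and the function map_to_dc are hypothetical names chosen for illustration.

```python
# Hypothetical local element names and the Dublin Core element each maps to.
LOCAL_TO_DC = {
    "Title": "title",
    "Creator": "creator",
    "Date Original": "date",
    "Date Digital": "date",
}


def map_to_dc(local_record: dict[str, str]) -> dict[str, list[str]]:
    """Map local elements to repeatable DC elements, skipping unmapped names."""
    dc: dict[str, list[str]] = {}
    for name, value in local_record.items():
        element = LOCAL_TO_DC.get(name)
        if element is None:
            continue  # no mapping defined for this local element
        values = dc.setdefault(element, [])
        if value not in values:  # avoid repeating an identical value
            values.append(value)
    return dc


local = {
    "Title": "Mona Lisa",
    "Creator": "Leonardo da Vinci",
    "Date Original": "1500",
    "Date Digital": "2002-10-30",
}
dc_record = map_to_dc(local)
```

The local record keeps the distinction between the date of the painting and the date of the digital image, while the mapped record carries both values in the repeatable DC Date element.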

Metadata Creation

• Natural metadata is found in the source document and created by the researcher or submitter
– supports discovery of resources
– includes the author’s name, date, title

• Added metadata is added by a metadata editor or by software
– supports resource selection
– includes subject terms, abstracts and rights metadata

Metadata Creation (cont.)

• Metadata as a view of the resource
• There is no one-size-fits-all metadata record
• Metadata for the same thing is different depending on collection, use and audience

Metadata Creation (cont.)

Construct a title for this image if the theme of the digital collection was:

• Rivers of Europe
– Elbe river with passenger boats, Dresden

• European Opera Houses
– “Semperoper” opera house in Dresden, Germany

• Cities of Europe
– River scene in Dresden, Germany

• Bridges of the World
– Augustus Bridge over the Elbe River, Dresden, Germany

Detailed vs Simple Metadata Descriptions

• Detailed metadata descriptions
– may improve searching precision
– require higher investment in creation of metadata
– make it more difficult to promote consistency in creation of metadata
• Simple descriptions
– are easier and less costly to generate
– require more effort on the part of searchers to identify the most relevant results
– improve probability of cross-disciplinary interoperability

Metadata Design and Documentation

• Metadata registry / Best practice guide / Data dictionary / Application profile
– provides standardized information for the definition, identification, and use of each data element
– ensures that a metadata schema and data elements in use by an organization can be applied consistently within the organization or community, reused by other communities, and interpreted by computer applications and human users

Value of Metadata Documentation

• Improve discovery of resources
• Increase interoperability across all collections created by an institution
• Increase interoperability with other digital libraries participating in the Open Archives Initiative
• Inform users on the digital object structure and the software needed to view the digital resource
• Ensure quality control for metadata records
• Assist with management and long-term preservation of digital files

Data Dictionaries (DDs)

• A table with applications of the metadata standard applicable to a specific collection or digital project or type of material
• Lists of local metadata elements
• Mapping to Dublin Core
• Specifications such as the use of controlled vocabulary
• Examples and comments about the use of each element

Types of material: Theses, Books, Images
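A data dictionary of this kind can also be expressed in a machine-readable form, so that records are checked against it. The Python sketch below is a minimal illustration: the class ElementSpec, the sample dictionary for theses, its local element names and the language codes are all hypothetical. Only the choice of Title, Language and Date as required elements follows the mandatory DSpace elements mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class ElementSpec:
    """One row of a data dictionary (hypothetical structure)."""
    local_name: str                        # name shown to metadata creators
    dc_field: str                          # Dublin Core field it maps to
    required: bool = False
    vocabulary: frozenset = frozenset()    # empty means free text
    comment: str = ""


# Hypothetical data dictionary for a theses collection.
THESIS_DICTIONARY = [
    ElementSpec("Title", "dc.title", required=True, comment="Title as on the title page"),
    ElementSpec("Language", "dc.language", required=True, vocabulary=frozenset({"en", "af"})),
    ElementSpec("Date Issued", "dc.date.issued", required=True),
    ElementSpec("Abstract", "dc.description.abstract"),
]


def check_record(record: dict[str, str], dictionary: list[ElementSpec]) -> list[str]:
    """Return a list of problems found when checking a record against the dictionary."""
    problems = []
    for spec in dictionary:
        value = record.get(spec.local_name)
        if value is None:
            if spec.required:
                problems.append(f"missing required element: {spec.local_name}")
            continue
        # Only elements with a controlled vocabulary are restricted in value.
        if spec.vocabulary and value not in spec.vocabulary:
            problems.append(f"{spec.local_name}: '{value}' is not in the controlled vocabulary")
    return problems
```

Calling check_record with a record that lacks Date Issued, or that uses a language code outside the list, returns one message per problem; an empty list means the record satisfies the dictionary.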

Best Practice Guides

• Guidance and documentation to describe and standardize the use of metadata elements that best support a community's needs

• Provide guidelines and decisions for metadata creators

• Explanation of metadata elements, terms and concepts

• Examples of the use of the different elements

• CDP Metadata Working Group. Dublin Core Metadata Best Practices, Version 2.1.1, Sept. 2006 http://www.cdpheritage.org/cdp/documents/cdpdcmbp.pdf

Metadata Registries

• A metadata registry is a central location in an organization where metadata definitions are stored and maintained in a controlled method (Wikipedia)

• Protected area
• Stores data elements
• Stores the meaning of a data element
• Defines how the metadata is represented

Closing Remarks

“Metadata” means many different things:
• It involves applying traditional library principles to new environments
• Good metadata practitioners use fundamental cataloging principles in non-MARC environments
• Documentation is important
• Good metadata promotes good digital collections
• There is always more to learn

References

• Taylor, Chris. (2003) An Introduction to metadata http://www.library.uq.edu.au/iad/ctmeta4.html

• Technical Advisory Service for Images (TASI). Metadata and digital images. http://www.tasi.ac.uk/advice/delivering/metadata.html

• Technical Advisory Service for Images (TASI). Controlling your language – links to metadata vocabularies http://www.tasi.ac.uk/resources/vocabs.html

• Hodge, Gail. (2001) Metadata made simpler.

• Smith, MacKenzie. (2003) DSpace: an open source dynamic digital repository. D-Lib Magazine, January 2003. http://www.dlib.org/dlib/january03/smith/1smith.html

• Disa Workshop: Digital collections management, University of KwaZulu-Natal, 2004.

References (cont.)

• NISO Framework Advisory Group. A Framework of Guidance for Building Good Digital Collections. 2nd ed. Bethesda, MD: National Information Standards Organization, 2004. http://www.niso.org/framework/framework2.html

• Xia, Jingfeng. Personal name identification in the practice of digital repositories. Electronic library and information systems. Vol. 40, no. 3, 2006. pp. 256-267.

• Miller, Steven J. Metadata for digital collections: an online workshop. University of Wisconsin-Milwaukee, School of Information Studies, 2007.

• CDP Metadata Working Group. Dublin Core Metadata Best Practices, Version 2.1.1, Sept. 2006 http://www.cdpheritage.org/cdp/documents/cdpdcmbp.pdf

Thank you!

Amelia Breytenbach

Metadata specialist

University of Pretoria

amelia.breytenbach@up.ac.za