Metadata and Tagging
Metadata
What is Metadata?
Metadata is ‘data about data’ or information about information.
Michael Day (Metadata in a Nutshell) defines metadata as “standardised descriptive information about resources, including non-digital ones.” (e.g. book metadata, photograph metadata, etc.)
3 types of Metadata
1. Descriptive Metadata (Human Level): “content about the content” or metacontent, used for the discovery and identification of resources.
2. Structural Metadata (Machine Level): data about the containers of data.
3. Administrative Metadata (Human Level): information for managing a resource.
Why create Metadata?
Metadata makes your work readily available because:
Searchable: it’s easy to find.
Authority: it’s clear who created it.
Citation: it’s easy for others to cite your work in their publications.
Collaboration: it helps other people to build on your work rather than having to recreate it.
Efficiency: everyone saves time and money if everyone creates metadata.
Funding: you may be required to make your work readily available to others.
What does Metadata do?
Metadata is the key to ensuring the survival and accessibility of resources into the future.
Descriptive metadata facilitates the discovery of relevant information.
Resource discovery (e.g. library catalogs)
Organize electronic resources
Facilitate interoperability and resource integration
Provide digital identification
Support archiving and preservation
Interoperability
Metadata is the key to interoperability.
Interoperability is the ability of multiple systems with different hardware and software platforms, data structures and interfaces to exchange data with minimal loss of content and functionality.
Metadata promotes interoperability by making digital information understandable to both humans and machines.
Storing Metadata
Metadata can be stored separately or embedded in a digital object.
Storing separately:
Can facilitate search and retrieval
Stored in a database and linked to the object described.
E.g. Semantic web
Storing metadata with object:
Ensures it won’t be lost
Removes the problem of linking between data and metadata
Metadata and object are updated simultaneously
E.g. TEI
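The two storage options above can be sketched in Python. This is only an illustration under stated assumptions: the `catalogue` dictionary stands in for a database, `letter.txt` is a hypothetical file name, and the embedded layout (one JSON line, a blank line, then the content) is made up for the example rather than taken from any standard.

```python
import json

# Option 1: metadata stored separately, keyed by the object's location.
# The dictionary stands in for a database; "letter.txt" is a hypothetical file.
catalogue = {
    "letter.txt": {"title": "A Letter", "creator": "Unknown"},
}

def find_by_creator(catalogue, creator):
    """Search the separate store without opening any of the objects."""
    return [name for name, record in catalogue.items()
            if record["creator"] == creator]

# Option 2: metadata embedded in the object itself.
# Assumed layout: one JSON line of metadata, a blank line, then the content.
def embed(metadata, content):
    return json.dumps(metadata) + "\n\n" + content

def extract(document):
    header, _, content = document.partition("\n\n")
    return json.loads(header), content

document = embed({"title": "A Letter", "creator": "Unknown"}, "Dear reader, ...")
metadata, content = extract(document)
print(find_by_creator(catalogue, "Unknown"))
print(metadata["title"], "|", content)
```

The separate store can be searched quickly but relies on the link (here, the file name) staying valid; the embedded form travels with the object, so the two cannot drift apart.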
Digital Identification
Metadata schemes include elements, such as standard numbers, to uniquely identify the work or object to which the metadata refers.
Location of a digital object is given using:
a file name
URL (Uniform Resource Locator)
PURL (Persistent URL): preferred!
DOI (Digital Object Identifier)
Collaborative Metadata
A tag is a non-hierarchical keyword or term assigned to a piece of information (Wikipedia)
Effectively, a tag is a form of metadata.
O’Shea (2013) describes 3 types of tagging :
1. Personal
2. Algorithmic
3. Social
Hashtags
A hashtag is a universal, standardised metadata tag/mark.
Therefore, #uccmadah is a form of descriptive metadata.
You choose your tags on Twitter, Facebook, WordPress, YouTube, etc.
But YouTube, Delicious, and WordPress algorithms also suggest tags for you.
HTML Metatags
These are all located inside the <head> element.
Example:
<head>
<meta name="author" content="Paul O’Shea">
<meta name="keywords" content="metadata, metacontent, HTML, etc….">
<meta name="description" content="MA DAH teaching Powerpoint slides">
</head>
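To see how a machine might read these meta tags, here is a minimal sketch using Python's standard `html.parser` module. The class name `MetaTagParser` and the sample page are assumptions for illustration; real pages may need more robust handling (for example, `property` attributes or duplicate names).

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            # Only keep tags that carry both a name and a content attribute.
            if "name" in attributes and "content" in attributes:
                self.meta[attributes["name"]] = attributes["content"]

page = """<head>
<meta name="author" content="Paul O’Shea">
<meta name="description" content="MA DAH teaching Powerpoint slides">
</head>"""

parser = MetaTagParser()
parser.feed(page)
print(parser.meta)
```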
Social Tagging
Social tagging is also known as folksonomy.
Folksonomy originates from folk + taxonomy
Taxonomy is the branch of science concerned with systematic classifications.
Ergo -> crowd-sourced tagging
See plugins for MediaWiki/Ruby on Rails for example!
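One way to picture crowd-sourced tagging is to count how many people applied each tag to the same resource. The sketch below assumes three hypothetical users and their tags; the `folksonomy` function is an illustrative name, not part of any tagging platform.

```python
from collections import Counter

# Hypothetical tags applied by three different users to the same resource.
user_tags = {
    "alice": ["metadata", "dublin-core", "libraries"],
    "bob": ["metadata", "tagging"],
    "carol": ["Metadata", "dublin-core"],
}

def folksonomy(user_tags):
    """Count how many users applied each tag, ignoring case."""
    counts = Counter()
    for tags in user_tags.values():
        # A set ensures each user counts at most once per tag.
        counts.update({tag.lower() for tag in tags})
    return counts

counts = folksonomy(user_tags)
print(counts.most_common(2))
```

There is no controlled vocabulary here: the most popular tags simply emerge from what users choose to type, which is why spelling and case variants usually need some normalisation.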
Geotagging
The process of adding geo-spatially represented metadata to various media and resources.
Provides location-specific information
Examples include:
Digital cameras automatically geotag using GPS.
Facebook and Twitter mobile apps allow you to geotag tweets and status updates.
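GPS-enabled cameras typically record position in the image's EXIF data as degrees, minutes and seconds plus a hemisphere letter, while most mapping tools expect signed decimal degrees. A minimal sketch of that conversion follows; the function name and the sample coordinates are assumptions chosen for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    if hemisphere in ("S", "W"):
        decimal = -decimal
    return decimal

# Hypothetical geotag: 51° 53' 50" N, 8° 28' 12" W
latitude = dms_to_decimal(51, 53, 50.0, "N")
longitude = dms_to_decimal(8, 28, 12.0, "W")
print(round(latitude, 6), round(longitude, 6))
```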
Archiving and Preservation
Special elements are required:
To track the lineage of a digital object.
To detail its physical characteristics.
To document its behaviour for future technologies.
Composing Metadata
Metadata schemes:
Sets of metadata elements designed for a specific purpose, e.g. describing a particular type of information resource.
If the resource lives on the internet, one may use a URI to locate it and RDF to describe it.
Metadata Schemes
The definitions or meanings of the metadata elements are the semantics of the scheme.
The values given to the metadata elements are the content.
Metadata schemes specify the names of elements and their semantics, and also the syntax rules for how elements and their content should be encoded.
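The distinction between element names, semantics, syntax rules and content can be sketched with a toy scheme. Everything here is an assumption for illustration: the three elements, their definitions and the regular expressions are invented, not taken from a published scheme.

```python
import re

# A toy scheme: element name -> (semantics, syntax rule as a regular expression).
SCHEME = {
    "title": ("A name given to the resource", r".+"),
    "date": ("Publication date, written as YYYY or YYYY-MM", r"\d{4}(-\d{2})?"),
    "language": ("Two-letter language code", r"[a-z]{2}"),
}

def validate(record):
    """Return a list of problems; an empty list means the record follows the scheme."""
    problems = []
    for element, value in record.items():
        if element not in SCHEME:
            problems.append(f"unknown element: {element}")
        elif not re.fullmatch(SCHEME[element][1], value):
            problems.append(f"bad syntax for {element}: {value!r}")
    return problems

# The values in a record are the content.
record = {"title": "Metadata Demystified", "date": "2003-07", "language": "en"}
print(validate(record))
print(validate({"date": "July 2003", "colour": "blue"}))
```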
The Dublin Core Metadata Element Set
Originated from a 1995 workshop sponsored by OCLC and NCSA
Dublin, Ohio
Common vocabulary, like FOAF (Friend of a Friend – describing interpersonal networks)
Important because it is endorsed by IETF and ISO
See Europeana: http://www.europeana.eu/
The Dublin Core Metadata Initiative (DCMI)
Original objective:
To define a set of elements that could be used by authors to describe their own web resources.
15 Elements:
Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights.
http://dublincore.org/documents/dces/
Dublin Core
Designed to be simple and concise
To describe Web based documents
Minimalist view vs. Structuralist view (See Lumpers and Splitters)
Minimalists: minimum elements, simple semantics and syntax
Structuralists: finer semantic distinctions and more extensibility for particular communities.
Dublin Core Example
Title="Metadata Demystified"
Creator="Brand, Amy"
Creator="Daly, Frank"
Creator="Meyers, Barbara"
Subject="metadata"
Description="Presents an overview of metadata conventions in publishing."
Publisher="NISO Press"
Publisher="The Sheridan Press"
Date="2003-07"
Type="Text"
Format="application/pdf"
Identifier="http://www.niso.org/standards/resources/Metadata_Demystified.pdf"
Language="en"
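A record like the one above is often exchanged as XML. The sketch below serialises part of it with Python's standard `xml.etree.ElementTree`, using the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The `metadata` wrapper element is an assumption; real applications embed Dublin Core in whatever container their own format defines.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Element/value pairs from the example; repeated elements are allowed.
pairs = [
    ("title", "Metadata Demystified"),
    ("creator", "Brand, Amy"),
    ("creator", "Daly, Frank"),
    ("creator", "Meyers, Barbara"),
    ("subject", "metadata"),
    ("publisher", "NISO Press"),
    ("date", "2003-07"),
    ("type", "Text"),
    ("format", "application/pdf"),
    ("language", "en"),
]

record = ET.Element("metadata")  # hypothetical wrapper element
for name, value in pairs:
    ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```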
Semantic Web
Focuses on semantic content rather than plain text content.
Allows for disambiguation of terms with the same name, but different meanings.
Evolved from limited, but simple, HTML meta tags to a complex ‘web’ of standards.
Some of the well established standards are Unicode, URI, XML, RDF, Web Ontology Language (OWL), etc.
Tim Berners-Lee http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
TEI
The Text Encoding Initiative http://www.tei-c.org/index.xml
International project to develop guidelines for marking up electronic texts such as novels, plays, poetry, correspondence, etc.
TEI Guidelines for Electronic Text Encoding and Interchange
Specify a header portion, embedded in the resource, that consists of metadata about the work.
TEI Header can be used to record bibliographic information about the digital edition.
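As a rough illustration of metadata embedded in the resource, the sketch below reads the title and author from the header of a tiny TEI document using Python's standard library. The sample document is invented for the example and is deliberately minimal; it is not guaranteed to satisfy every requirement of the TEI Guidelines.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A made-up, minimal TEI document: header (metadata) followed by the text.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>A Sample Digital Edition</title>
        <author>Unknown</author>
      </titleStmt>
      <publicationStmt><p>Unpublished example.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>The text itself goes here.</p></body></text>
</TEI>"""

root = ET.fromstring(sample)
title_path = "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"
author_path = "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:author"
title = root.find(title_path, TEI_NS).text
author = root.find(author_path, TEI_NS).text
print(title, "/", author)
```

Because the header sits in the same file as the text, the bibliographic description cannot be separated from the edition it describes.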