Module 3b: Metadata IMT530: Organization of Information Resources Winter 2007 Michael Crandall.

Date posted: 20-Dec-2015

Module 3b: Metadata

IMT530: Organization of Information Resources

Winter 2007

Michael Crandall

IMT530A- Organization of Information Resources 2

Recap

• Information systems have two inputs
  – User needs
  – Information objects

• Representing those inputs effectively inside the system enables the output of objects (or pointers to objects) matching user needs

• Representation is accomplished through developing a model that describes the needs and the objects

• This is expressed through metadata and controlled vocabularies, which are applied to the user needs and information objects

• So what is metadata anyway, and how is it created and used?

Ways to Express Meaning: for people & machines

[Figure: diagram of ways to express meaning, grouped under four headings: Glossaries / Controlled Vocabularies; Informal Taxonomies and Thesauri; Data and Document Metamodels; Formal Knowledge Bases & Inference. Labels on the diagram: terms; 'ordinary' glossaries; structured glossaries; data dictionaries (EDI); ad hoc hierarchies (Yahoo!); thesauri; principled, informal taxonomies; XML DTDs; DB schema; XML Schema; data models (UML, STEP); formal taxonomies; frames (OKBC); restricted logics (OWL, Flogic); general logic. Source: Michael Uschold, The Boeing Company]

Module 3b Outline

• Metadata defined
• Origins of metadata theory
• Types of metadata
• Metadata schemas
• Objectives of metadata
• Using metadata
• Encoding metadata
• Creating metadata
• Metadata issues

What is Metadata?

• Data about data

• Definitional data that provides information about or documentation of other data managed within an application or environment… metadata may include descriptive information about the context, quality and condition, or characteristics of the data (FOLDOC)

• Levels of complexity
  – Simple (embedded in object; e.g., a hyperlink)
  – Structured (Dublin Core, content management)
  – Rich (library MARC records, Encoded Archival Description)

Origins

• Library science
  – Focus is on entities as containers for information
  – Emphasis is on resource discovery
  – Tight focus resulted in widespread standards

• Data management
  – Focus is on the information itself
  – Much more complex information spaces (e.g., NASA satellite data)
  – Much more varied types of information and use
  – Emphasis is on data use (authenticity, authority)
  – Standards tend to be associated with data types

Types of Metadata

• Administrative
  – Object management
  – Rights and access management
  – Maintenance and preservation
  – Meta-metadata for managing metadata

• Structural or technical
  – Describes relationships between parts
  – Enables recognition and use of objects by systems

• Descriptive
  – Describes characteristics of object
  – Physical and aboutness (subject)
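One way to picture the three types is as separate groups of fields on a single record. The sketch below is only an illustration: the class name `Record` and every field name in it are assumptions made for this example, not part of any standard schema. The descriptive values come from the MARC example used later in this module; the administrative and structural values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # Administrative: managing the object, its rights, and its upkeep (hypothetical fields).
    administrative: dict = field(default_factory=dict)
    # Structural/technical: how parts relate and how a system should handle the object.
    structural: dict = field(default_factory=dict)
    # Descriptive: physical characteristics and aboutness (subject).
    descriptive: dict = field(default_factory=dict)


record = Record(
    administrative={"rights": "unrestricted", "last_reviewed": "2007-01-15"},  # invented
    structural={"format": "text/html", "parts": ["caption title", "body"]},    # invented
    descriptive={
        "title": "Policing the new world disorder",
        "subject": ["International police"],
        "extent": "4 p.",
    },
)

# List the element names held under each type of metadata.
for kind in ("administrative", "structural", "descriptive"):
    print(kind, "->", sorted(getattr(record, kind)))
```

Keeping the groups apart makes it easier to see who typically supplies each one: systems and authoring tools for the first two, people for the third.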

Metadata Schemas

• Sets of metadata elements designed to meet the needs of a community

• The elements are the fields that hold values authorized for use in the schema

• Many different needs, so many different schemas are available

• Three primary components
  – Structure: the model used to derive the schema (e.g., RDF)
  – Semantics: the meaning of the elements
    • Values are specified through rules or vocabularies (“encoding schemes” or authority control)
  – Syntax: the method for encoding the schema (e.g., XML, XHTML)
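The idea that elements hold values authorized by rules or vocabularies can be sketched in a few lines. This is a toy schema under stated assumptions: the element names, the three-code language list, and the helper names `is_year` and `validate` are all invented for illustration and do not reproduce any published schema.

```python
# A toy schema: each element is paired with the check that authorizes its values.
LANGUAGE_CODES = {"eng", "fre", "spa"}  # stand-in for a controlled vocabulary


def is_year(value: str) -> bool:
    """Rule-based check: the value must be a four-digit year."""
    return len(value) == 4 and value.isdigit()


SCHEMA = {
    "title": lambda v: bool(v.strip()),         # free text, but must not be empty
    "language": lambda v: v in LANGUAGE_CODES,  # value must come from the vocabulary
    "date": is_year,                            # value must follow a formatting rule
}


def validate(record: dict) -> list:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for element, value in record.items():
        if element not in SCHEMA:
            problems.append(f"unknown element: {element}")
        elif not SCHEMA[element](value):
            problems.append(f"bad value for {element}: {value!r}")
    return problems


good = validate({"title": "Policing the new world disorder", "language": "eng", "date": "1996"})
bad = validate({"title": "Untitled", "language": "english", "colour": "red"})
print(good)
print(bad)
```

The second record fails twice: "english" is not in the vocabulary, and "colour" is not an element of the schema at all.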

Schema Characteristics

• Interoperability
  – Structural (same model)
  – Semantic (same meaning for elements)
  – Syntactic (same encoding format)

• Flexibility
  – Ability to use parts or all of elements and values

• Extensibility
  – Allows addition or qualification of elements to meet local needs
  – Tradeoff with interoperability

Objectives of Metadata

• Find
  – Through search engines, catalogs, etc.

• Identify
  – Distinguishing between items for purposes of use

• Select
  – By attributes such as language, format, genre, etc.

• Obtain
  – Either directly or through location/ordering metadata

• Navigate
  – For example, categories on web sites

• Manage
  – Content management systems
  – Document repositories
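Two of these objectives, find and select, map directly onto simple operations over metadata records. The sketch below assumes records are plain dictionaries; the three sample records and the function names `find` and `select` are invented for illustration.

```python
# Invented sample records; only the first title comes from the module's MARC example.
records = [
    {"id": 1, "title": "Policing the new world disorder", "language": "eng", "format": "print"},
    {"id": 2, "title": "Policing the new world disorder", "language": "eng", "format": "online"},
    {"id": 3, "title": "Maintien de l'ordre", "language": "fre", "format": "print"},
]


def find(records, word):
    """Find: keyword match against the title, as a search engine or catalog might do."""
    return [r for r in records if word.lower() in r["title"].lower()]


def select(records, **attributes):
    """Select: keep only records whose attributes all equal the requested values."""
    return [r for r in records if all(r.get(k) == v for k, v in attributes.items())]


hits = find(records, "policing")
chosen = select(hits, format="online")
print([r["id"] for r in hits])
print([r["id"] for r in chosen])
```

Finding returns both versions of the same title; it is the format attribute that lets a user pick the one they can actually use.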

Using Metadata

• Application profiles
  – Collection of elements from multiple schemas used to meet local needs
  – May extend or refine if allowed by rules in schema namespace
  – Can’t add new elements or you’re creating a new schema

• Registries
  – Machine-accessible repositories of schemas
  – Allow reuse and sometimes interoperability

• Crosswalks
  – Manual equivalence tables across schemas
  – Often used to provide partial interoperability across systems
  – Difficult to achieve 1:1 correspondence, however

• Roll your own
  – Most common approach in business applications
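A crosswalk is easy to sketch as a lookup table, and the sketch also shows why 1:1 correspondence is hard. The mapping below is a simplified illustration, loosely in the spirit of MARC-to-Dublin-Core crosswalks; it is an assumption for this example, not a copy of any published table, and the function name `crosswalk` is hypothetical.

```python
# Illustrative equivalence table: source tag -> target element.
CROSSWALK = {
    "245": "title",
    "100": "creator",
    "700": "creator",   # two source tags collapse into one target element
    "260": "publisher",
    "650": "subject",
}


def crosswalk(source_fields):
    """Map (tag, value) pairs to a target record; also report tags with no equivalent."""
    target = {}
    unmapped = []
    for tag, value in source_fields:
        element = CROSSWALK.get(tag)
        if element is None:
            unmapped.append(tag)
        else:
            target.setdefault(element, []).append(value)
    return target, unmapped


marc_fields = [
    ("100", "Oakley, Robert B., 1931-"),
    ("245", "Policing the new world disorder"),
    ("300", "4 p. ; 28 cm."),
    ("650", "International police."),
    ("700", "Dziedzic, Michael J."),
]

dc, lost = crosswalk(marc_fields)
print(dc["creator"])
print(lost)
```

Two things are lost in translation here: the distinction between main entry (100) and added entry (700) disappears, and the physical description (300) has nowhere to go. That is the sense in which a crosswalk gives only partial interoperability.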

Encoding Metadata (Syntax)

• Translating a metadata schema to syntax is essential
  – For machines to be able to access the metadata record
  – For display of metadata elements/values
  – For record transmission

• The standard in the library world is MARC (Machine-Readable Cataloging)
  – Current version is MARC 21

• Current standard in most other worlds is XML and its various flavors created for specific applications
  – Controlled by DTDs (Document Type Definitions) or XML schemas
  – Advantage of a schema is that it is also expressed in XML, so it can be referred to easily by other XML applications
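To make the step from schema to syntax concrete, here is a minimal sketch that encodes one record as XML using Python's standard `xml.etree.ElementTree` module. The element names (`record`, `title`, `creator`, `date`) and the function name `to_xml` are assumptions chosen for the example; they are not taken from a real schema or namespace.

```python
import xml.etree.ElementTree as ET

record = {
    "title": "Policing the new world disorder",
    "creator": ["Oakley, Robert B.", "Dziedzic, Michael J."],
    "date": "1996",
}


def to_xml(record: dict) -> str:
    """Encode a record as XML: one child element per value, repeated for lists."""
    root = ET.Element("record")
    for element, values in record.items():
        for value in values if isinstance(values, list) else [values]:
            child = ET.SubElement(root, element)
            child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(record)
print(xml_text)

# The same text can be read back by any XML-aware application.
parsed = ET.fromstring(xml_text)
print(parsed.findtext("title"))
```

The record itself has not changed; only its syntax has. That is what lets another system access, display, or receive it without knowing how it was stored internally.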

Example of MARC Record

01187nam 2200337 a 4500001001200000003000600012005001700018008004100035035001500076040001800091049000900109074001400118074002300132086001500155099001900170100003000189245007800219260010600297300001900403440003000422500001900452500003400471500002000505530009200525610003400617650002600651700002500677710007600702856006000778949001100838tmp96303807OCoLC19970728102440.0971114s1996 dcu f000 0 eng d a1258-02760 dGPOdDLCdMvI aVPII a0378-H-12 a0378-H-12 (online)0 aD 5.417:84 aDocs D5.417:841 aOakley, Robert B.,d1931-10aPolicing the new world disorder /cby Robert Oakley and Michael Dziedzic. a[Washington, D.C.?] :bNational Defense University, Institute for National Strategic Studies,c[1996] a4 p. ;c28 cm. 0aStrategic forum ;vno. 84 aCaption title. aShipping list no.: 97-0045-P. a"October 1996." aAlso available via Internet from the Institute for National Strategic Studies web site.20aUnited NationsxArmed Forces. 0aInternational police.1 aDziedzic, Michael J.2 aNational Defense University.bInstitute for National Strategic Studies.7 uhttp://www.ndu.edu/ndu/inss/strforum/forum84.html2http
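The record above looks opaque, but its opening follows a fixed layout. In MARC 21 the first 24 characters are the leader, and the directory that follows is a run of 12-character entries: a 3-digit field tag, a 4-digit field length, and a 5-digit starting position. The sketch below parses only the first four directory entries, copied from the record above. The function name `parse_directory` is hypothetical, and the full record is not parsed here because the transcript has lost its control characters and some spacing.

```python
def parse_directory(directory: str):
    """Split a MARC directory into (tag, field_length, start_position) tuples."""
    entries = []
    for i in range(0, len(directory), 12):
        chunk = directory[i:i + 12]
        tag = chunk[0:3]
        length = int(chunk[3:7])
        start = int(chunk[7:12])
        entries.append((tag, length, start))
    return entries


# First four directory entries from the example record (tags 001, 003, 005, 008).
DIRECTORY_START = "001001200000003000600012005001700018008004100035"

entries = parse_directory(DIRECTORY_START)
for tag, length, start in entries:
    print(tag, length, start)

# Fields are packed end to end, so each start equals the previous start plus length.
contiguous = all(
    entries[i][2] + entries[i][1] == entries[i + 1][2]
    for i in range(len(entries) - 1)
)
print(contiguous)
```

This is why a machine can jump straight to field 008 without scanning the whole record, and also why the format is hard for a person to read.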

XML Version of MARC Record

<?xml version="1.0" encoding="UTF-8" ?>
<sequence xmlns="http://www.dlib.vt.edu/projects/OAi/marcxml/container"
          xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
          xsi:schemaLocation="http://www.openarchives.org/OAI/oai_marc http://www.openarchives.org/OAI/oai_marc.xsd http://www.dlib.vt.edu/projects/OAi/marcxml/container http://www.dlib.vt.edu/projects/OAi/marcxml/container.xsd">
  <oai_marc xmlns="http://www.openarchives.org/OAI/oai_marc" status="n" type="a" level="m" catForm="a">
    <fixfield id="1">"tmp96303807"</fixfield>
    <fixfield id="3">"OCoLC"</fixfield>
    <fixfield id="5">"19970728102440.0"</fixfield>
    <fixfield id="8">"971114s1996 dcu f000 0 eng d"</fixfield>
    <varfield id="35" i1="" i2="">
      <subfield label="a">1258-02760</subfield>
    </varfield>
    <varfield id="40" i1="" i2="">
      <subfield label="d">GPO</subfield>
      <subfield label="d">DLC</subfield>
      <subfield label="d">MvI</subfield>
    </varfield>
    <varfield id="49" i1="" i2="">
      <subfield label="a">VPII</subfield>
    </varfield>
    <varfield id="74" i1="" i2="">
      <subfield label="a">0378-H-12</subfield>
    </varfield>
    <varfield id="74" i1="" i2="">
      <subfield label="a">0378-H-12 (online)</subfield>
    </varfield>
    <varfield id="86" i1="0" i2="">
      <subfield label="a">D 5.417:84</subfield>
    </varfield>

XML Version of MARC Record

    <varfield id="99" i1="" i2="">
      <subfield label="a">Docs D5.417:84</subfield>
    </varfield>
    <varfield id="100" i1="1" i2="">
      <subfield label="a">Oakley, Robert B.,</subfield>
      <subfield label="d">1931-</subfield>
    </varfield>
    <varfield id="245" i1="1" i2="0">
      <subfield label="a">Policing the new world disorder /</subfield>
      <subfield label="c">by Robert Oakley and Michael Dziedzic.</subfield>
    </varfield>
    <varfield id="260" i1="" i2="">
      <subfield label="a">[Washington, D.C.?] :</subfield>
      <subfield label="b">National Defense University, Institute for National Strategic Studies,</subfield>
      <subfield label="c">[1996]</subfield>
    </varfield>
    <varfield id="300" i1="" i2="">
      <subfield label="a">4 p. ;</subfield>
      <subfield label="c">28 cm.</subfield>
    </varfield>
    <varfield id="440" i1="" i2="0">
      <subfield label="a">Strategic forum ;</subfield>
      <subfield label="v">no. 84</subfield>
    </varfield>
    <varfield id="500" i1="" i2="">
      <subfield label="a">Caption title.</subfield>
    </varfield>
    <varfield id="500" i1="" i2="">
      <subfield label="a">Shipping list no.: 97-0045-P.</subfield>
    </varfield>

XML Version of MARC Record

    <varfield id="500" i1="" i2="">
      <subfield label="a">"October 1996."</subfield>
    </varfield>
    <varfield id="530" i1="" i2="">
      <subfield label="a">Also available via Internet from the Institute for National Strategic Studies web site.</subfield>
    </varfield>
    <varfield id="610" i1="2" i2="0">
      <subfield label="a">United Nations</subfield>
      <subfield label="x">Armed Forces.</subfield>
    </varfield>
    <varfield id="650" i1="" i2="0">
      <subfield label="a">International police.</subfield>
    </varfield>
    <varfield id="700" i1="1" i2="">
      <subfield label="a">Dziedzic, Michael J.</subfield>
    </varfield>
    <varfield id="710" i1="2" i2="">
      <subfield label="a">National Defense University.</subfield>
      <subfield label="b">Institute for National Strategic Studies.</subfield>
    </varfield>
    <varfield id="856" i1="7" i2="">
      <subfield label="u">http://www.ndu.edu/ndu/inss/strforum/forum84.html</subfield>
      <subfield label="2">http</subfield>
    </varfield>
    <varfield id="949" i1="" i2="">
      <subfield label="a">000103</subfield>
    </varfield>
  </oai_marc>
</sequence>
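Because the XML version labels every field and subfield explicitly, a general-purpose XML parser can read it with no MARC-specific code. The sketch below uses Python's standard `xml.etree.ElementTree` on a two-field excerpt of the record. As a simplifying assumption the namespace declarations are left out of the excerpt, so plain element names can be used in lookups; the helper name `subfields` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Excerpt of the record above, with namespace declarations omitted for simplicity.
EXCERPT = """
<oai_marc status="n" type="a" level="m" catForm="a">
  <varfield id="100" i1="1" i2="">
    <subfield label="a">Oakley, Robert B.,</subfield>
    <subfield label="d">1931-</subfield>
  </varfield>
  <varfield id="245" i1="1" i2="0">
    <subfield label="a">Policing the new world disorder /</subfield>
    <subfield label="c">by Robert Oakley and Michael Dziedzic.</subfield>
  </varfield>
</oai_marc>
""".strip()


def subfields(root, tag):
    """Return {label: text} for the first varfield with this tag, or {} if absent."""
    for varfield in root.findall("varfield"):
        if varfield.get("id") == tag:
            return {sf.get("label"): sf.text for sf in varfield.findall("subfield")}
    return {}


root = ET.fromstring(EXCERPT)
title = subfields(root, "245")
author = subfields(root, "100")
print(title["a"])
print(author["a"], author["d"])
```

With the namespaces left in, the same lookups would need namespace-qualified element names, but the structure of the code would not change.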

MARC Record in Text

Call Number: Docs D5.417:84
Authors: Oakley, Robert B., 1931-
         Dziedzic, Michael J.
         National Defense University. Institute for National Strategic Studies.
Titles: Policing the new world disorder / / by Robert Oakley and Michael Dziedzic.
        Strategic forum ; / no. 84
Imprint: [Washington, D.C.?] : / National Defense University, Institute for National Strategic Studies, / [1996]
Description: 4 p. ; 28 cm.
Notes: Caption title.
       Shipping list no.: 97-0045-P.
       "October 1996."
Access: URL: http://www.ndu.edu/ndu/inss/strforum/forum84.html |2|: http
Subjects: United Nations — Armed Forces.
          International police.
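The text view is the same record with display labels substituted for field tags. A minimal sketch of that step follows. The tag-to-label grouping is read off the example above and simplified to four labels, so it is an assumption about how this particular display was built, not a rule of MARC; the names `DISPLAY_LABELS` and `render` are hypothetical.

```python
# Display label -> the field tags gathered under it (simplified from the example).
DISPLAY_LABELS = [
    ("Call Number", ["99"]),
    ("Authors", ["100", "700", "710"]),
    ("Titles", ["245", "440"]),
    ("Subjects", ["610", "650"]),
]


def render(fields):
    """Build a labelled text display from (tag, value) pairs, skipping empty labels."""
    lines = []
    for label, tags in DISPLAY_LABELS:
        values = [value for tag, value in fields if tag in tags]
        if values:
            lines.append(f"{label}: " + "; ".join(values))
    return "\n".join(lines)


fields = [
    ("99", "Docs D5.417:84"),
    ("100", "Oakley, Robert B., 1931-"),
    ("245", "Policing the new world disorder"),
    ("650", "International police."),
    ("700", "Dziedzic, Michael J."),
]

display = render(fields)
print(display)
```

Several tags share one label, so the display is easier to read than the record but carries less information: a reader can no longer tell which author was the main entry.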

Creating Metadata

• We’ve focused on building metadata structures, but someone has to actually create the metadata values used in a system

• Structural and administrative metadata values are often applied when information is created, or generated automatically by authoring tools

• Descriptive metadata is harder to create
  – In libraries, traditionally has been done by trained professionals
  – Some automated tools have shown limited success in narrow domains
  – End users have not generally been a good source unless forced to as part of document creation
  – Quality often suffers when trained indexers are not used
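The contrast between automatic and human-created metadata shows up clearly in code. The sketch below gathers a few technical and administrative values for a file using only Python's standard library; the function name `technical_metadata` and the keys it returns are assumptions for this example. Nothing in it can say what the document is about, which is the part that still needs a person or a much more capable tool.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def technical_metadata(path: str) -> dict:
    """Collect values a system can determine without reading the content for meaning."""
    info = os.stat(path)
    mime, _ = mimetypes.guess_type(path)
    return {
        "filename": os.path.basename(path),
        "size_bytes": info.st_size,
        "format": mime or "application/octet-stream",
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


# Demonstration on a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "forum84.html")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("<html><title>Strategic forum</title></html>")
    meta = technical_metadata(path)

print(meta["filename"], meta["size_bytes"], meta["format"])
```

Subject terms, audience, and summary are absent from the result because they cannot be derived from file system facts alone.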

Metadata Issues

• Make sure you can measure results

• Don’t assume one size fits all

• Choose user access points wisely

• Provide user tools and education for effective use of your metadata

• Make sure you’re adding value

• Balance theory with practical needs

• Trust and provenance

Questions?

• If not, take a break!!!

Exercise 3

• Find your groups

• Spend the next 30 minutes exploring the examples in Exercise 3

• Ask questions and talk!!!

• Be sure to hand in completed work at the end of class for credit!!!

Next Week

• We’ll look at application profiles and selection of metadata elements for description and access (Part 1 of your assignment)

• Remember to read assignments BEFORE class

• Have a great weekend!!