IS 257 – Fall 2009 2009.01.21 - SLIDE 1
Organization of Information in Collections: Introduction
University of California, Berkeley
School of Information
IS 245: Organization of Information In Collections
IS 257 – Fall 2009 2009.01.21 - SLIDE 2
Lecture Contents
• Course Introduction
• Organization of Information
• Metadata
• Dublin Core
• Controlled Vocabularies
• Discussion
IS 257 – Fall 2009 2009.01.21 - SLIDE 4
Course contents
• Metadata and Metadata Schemas
• Bibliographic Description
• Access Points and Vocabulary Control
• Topical/Subject Description
• Thesauri
• Ontologies
• Other Metadata/Description/Organization topics
IS 257 – Fall 2009 2009.01.21 - SLIDE 5
COURSE OUTLINE
• Among the topics that will be covered during the semester are a number of “traditional” library-related topics:
• BIBLIOGRAPHIC DESCRIPTION
  – Introduction to the use of standards and codes for description of bibliographic materials, including the International Standard Bibliographic Description and the Anglo-American Cataloging Rules.
IS 257 – Fall 2009 2009.01.21 - SLIDE 6
COURSE OUTLINE
• ACCESS
  – 1. Access by names--Issues and problems, including name authority control
  – 2. Access by subject
    • a. Types of access: descriptors; index terms--including types of indexes (e.g. KWIC, KWOC); subject headings; relational indexes (e.g. PRECIS)
    • b. Vocabulary control--role of thesauri and their use (e.g. Library of Congress Subject Headings; Medical Subject Headings; the Art and Architecture Thesaurus)
IS 257 – Fall 2009 2009.01.21 - SLIDE 7
COURSE OUTLINE
• ACCESS (cont.)
    • c. Classification schemes and their uses: shelf arrangement; organization of printed lists; thesaurus hierarchies
    • d. Subject authority control
  – 3. Access by other attributes
    • a. Physical attributes of documents: title, text
    • b. Other attributes: language, uniform title
IS 257 – Fall 2009 2009.01.21 - SLIDE 8
COURSE OUTLINE
• ACCESS (cont.)
  – 4. Use of multiple access points: e.g. subject and date
  – 5. Evaluation of different access points within systems (e.g. purposes served by access through classification scheme and alphabetical subject terms within a catalog or index)
IS 257 – Fall 2009 2009.01.21 - SLIDE 9
COURSE OUTLINE
• Metadata and Metadata Schemas
  – MARC
  – MODS
  – METS
  – EAD
  – EAC
  – Dublin Core
  – OWL
  – RDF
  – FRBR (which isn’t really a schema, but a model)
IS 257 – Fall 2009 2009.01.21 - SLIDE 10
Course Requirements
• Assignments and exercises (30%)
• Final Paper/Project (60%)
  – Can be a traditional research paper on an organizational topic or a project such as construction of a Thesaurus or Ontology for a particular topical area.
  – Could be part of a MIMS final project
• Class Participation – including class reports (10%)
IS 257 – Fall 2009 2009.01.21 - SLIDE 11
Lecture Contents
• Course Introduction
• Organization of Information
• Metadata
• Dublin Core
• Controlled Vocabularies
• Discussion
IS 257 – Fall 2009 2009.01.21 - SLIDE 12
Organization of Information
• Is there a basic human need to put things into some sort of order?
  – Much of natural language concerns categories of things rather than individual things
  – Why do we organize things and information?
    • Why do spoons go in THAT drawer in the kitchen and not in a can in the garage?
    • Why do your favorite books go on one shelf and not-so-favorite on another?
IS 257 – Fall 2009 2009.01.21 - SLIDE 13
Why Organize Information?
• The main reason
  – So that you can find things more effectively
• I.e., effective retrieval is predicated on some sort of organization applied to information resources
• Historically there have been many institutions and tools devoted to information organization
  – Libraries
  – Museums
  – Archives
  – Indexes and catalogs, dictionaries, phone books, etc.
IS 257 – Fall 2009 2009.01.21 - SLIDE 14
Why Organize Information?
• A question of scale
  – Using your own ad hoc set of categories and methods to organize your own collection of books or CDs seems to work fine…
  – What if your collection grew to
    • 10 times the size? How would you organize it?
    • 100 times?
    • 1,000 times?
    • 100,000 times?
    • What if it wasn’t physical objects, but electronic?
IS 257 – Fall 2009 2009.01.21 - SLIDE 15
What is Information Organization?
• Identifying the existence of all types of information-bearing entities as they are made available
• Identifying the works contained within those information-bearing entities or as parts of them
• Systematically pulling together these information-bearing entities into collections in libraries, archives, museums, Internet communications files and other such depositories
From Hagler via Taylor, Chap. 1
IS 257 – Fall 2009 2009.01.21 - SLIDE 16
What is Information Organization?
• Producing lists of these information-bearing entities prepared according to standard rules for citation
• Providing name, title, subject and other useful access to these information-bearing entities
• Providing the means of locating each information-bearing entity or a copy of it
IS 257 – Fall 2009 2009.01.21 - SLIDE 17
Key Issues in This Course
• How to describe information resources or information-bearing objects so that they may be effectively used by those who need to use them
  – Organizing
• How to find the appropriate information resources or information-bearing objects for someone’s (or your own) needs
  – Retrieving
IS 257 – Fall 2009 2009.01.21 - SLIDE 18
Key Issues
[Diagram; labels recovered from the slide: Creation, Utilization, Searching; Active, Semi-Active, Inactive; Retention/Mining; Disposition, Discard; Using/Creating, Authoring/Modifying, Organizing/Indexing, Storing/Retrieval, Distribution/Networking, Accessing/Filtering]
IS 257 – Fall 2009 2009.01.21 - SLIDE 19
Organizing/Indexing
• Collecting and integrating information
• Affects data, information and metadata
• “Metadata” describes data and information
  – More on this shortly
• Organizing information
  – Types of organization?
• Indexing
IS 257 – Fall 2009 2009.01.21 - SLIDE 20
Accessing/Filtering
• Using the organization created in the Organizing/Indexing (O/I) stage to
  – Select desired (or relevant) information
  – Locate that information
  – Retrieve the information from its storage location (often via a network)
IS 257 – Fall 2009 2009.01.21 - SLIDE 21
Structure of an IR System
[Diagram: Information Storage and Retrieval System. Labels recovered from the slide: Interest profiles & Queries; Documents & data; Rules of the game = Rules for subject indexing + Thesaurus (which consists of Lead-In Vocabulary and Indexing Language); Storage Line; Formulating query in terms of descriptors; Indexing (Descriptive and Subject); Storage of profiles; Storage of Documents; Store 1: Profiles/Search requests; Store 2: Document representations; Comparison/Matching; Potentially Relevant Documents]
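The comparison/matching step between the two stores can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lead-in vocabulary entries, the document ids, and the function names (`normalize`, `build_index`, `match`) are all hypothetical, and real systems use far richer matching than this Boolean AND.

```python
from collections import defaultdict

# Lead-in vocabulary: maps entry terms a searcher might type to the
# preferred descriptors of the indexing language (hypothetical entries).
LEAD_IN = {"cataloguing": "cataloging", "classifying": "classification"}

# Store 2: document representations, i.e. descriptors assigned at indexing time.
DOCUMENTS = {
    "doc1": {"cataloging", "classification"},
    "doc2": {"metadata"},
    "doc3": {"classification", "metadata"},
}


def normalize(term):
    """Lower-case a term and map it to its preferred descriptor, if any."""
    term = term.lower()
    return LEAD_IN.get(term, term)


def build_index(documents):
    """Build an inverted index: descriptor -> set of document ids."""
    index = defaultdict(set)
    for doc_id, descriptors in documents.items():
        for descriptor in descriptors:
            index[descriptor].add(doc_id)
    return index


def match(query_terms, index):
    """Return ids of documents indexed under every descriptor in the query."""
    postings = [index.get(normalize(t), set()) for t in query_terms]
    return set.intersection(*postings) if postings else set()


index = build_index(DOCUMENTS)
potentially_relevant = match(["Cataloguing"], index)
```

The query is formulated in terms of descriptors by passing it through the same vocabulary used at indexing time, which is the point of the "rules of the game" box in the diagram.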
IS 257 – Fall 2009 2009.01.21 - SLIDE 22
Lecture Contents
• Course Introduction
• Organization of Information
• Metadata
• Dublin Core
• Controlled Vocabularies
• Discussion
IS 257 – Fall 2009 2009.01.21 - SLIDE 23
Metadata
• Metadata is
  – “Data about Data” (database systems)
  – Information about Information
• First used (to the best we can discover) in 1978, as “meta-data”
• Used for databases in the term “Meta-Data Base”
  – “a data base which itself contains the structural and semantic data of other data bases”
    » Thomas R. Cousins & Wayne D. Dominick, “The Management of Data Bases of Data Bases,” ASIS Proceedings, 1978.
IS 257 – Fall 2009 2009.01.21 - SLIDE 24
Metadata
• Structures and languages for the description of information resources and their elements (components or features)
• “Metadata is information on the organization of the data, the various data domains, and the relationship between them” (Baeza-Yates p. 142)
IS 257 – Fall 2009 2009.01.21 - SLIDE 25
Metadata
• Often two main types of metadata are distinguished
  – Descriptive metadata
    • Describes the information/data object and its properties
    • May use a variety of descriptive formats and rules
  – Topical metadata
    • Describes the topic or “aboutness” of an information/data object
    • May include a variety of vocabularies for describing subjects, topics, categories, etc.
IS 257 – Fall 2009 2009.01.21 - SLIDE 26
Types of Metadata
• Element names
• Element description
• Element representation
• Element coding
• Element semantics
• Element classification
IS 257 – Fall 2009 2009.01.21 - SLIDE 27
Metadata Systems and Standards
• Naming and ID systems
• Bibliographic description
  – Texts
• Music
• Images and objects
• Numeric data
• Geospatial data
• Collections
• Video and motion pictures
IS 257 – Fall 2009 2009.01.21 - SLIDE 28
The Same Item in Different Metadata Systems
• ISBD
• RFC 1807
• TEI Header
• MARC Record
• Dublin Core (a bit later)
IS 257 – Fall 2009 2009.01.21 - SLIDE 29
ISBD Punctuation
• Title Proper (GMD) = Parallel title : other title info / First statement of responsibility ; others. -- Edition information. -- Material. -- Place of Publication : Publisher Name, Date. -- Material designation and extent ; Dimensions of item. -- (Title of Series / Statement of responsibility). -- Notes. -- Standard numbers: terms of availability (qualifications).
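A minimal sketch of how this punctuation pattern could be applied programmatically. The function name `isbd_string` and the dictionary keys are hypothetical, and only a subset of the areas (title, edition, publication, series) is handled.

```python
def isbd_string(rec):
    """Assemble a subset of ISBD areas using the prescribed punctuation."""
    # Title proper / first statement of responsibility
    title_area = rec["title"]
    if rec.get("responsibility"):
        title_area += " / " + rec["responsibility"]
    areas = [title_area]

    # Edition statement / statement of responsibility for the edition
    if rec.get("edition"):
        edition = rec["edition"]
        if rec.get("edition_responsibility"):
            edition += " / " + rec["edition_responsibility"]
        areas.append(edition)

    # Place of publication : publisher name, date
    if rec.get("place"):
        areas.append(f'{rec["place"]} : {rec["publisher"]}, {rec["date"]}')

    # Series title, in parentheses
    if rec.get("series"):
        areas.append(f'({rec["series"]})')

    # Areas are separated by ". -- "; do not double a period that already ends an area.
    out = areas[0]
    for area in areas[1:]:
        out += (" -- " if out.endswith(".") else ". -- ") + area
    return out if out.endswith(".") else out + "."


record = {
    "title": "Introduction to cataloging and classification",
    "responsibility": "Bohdan S. Wynar",
    "edition": "8th ed.",
    "edition_responsibility": "Arlene G. Taylor",
    "place": "Englewood, Colo.",
    "publisher": "Libraries Unlimited",
    "date": "1992",
    "series": "Library science text series",
}
print(isbd_string(record))
```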
IS 257 – Fall 2009 2009.01.21 - SLIDE 30
Bibliographic Record
• Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992. -- (Library science text series).
IS 257 – Fall 2009 2009.01.21 - SLIDE 31
RFC 1807
BIB-VERSION:: CS-TR-v2.1
ID:: UCB//123456
ENTRY:: September 9, 1997
TYPE:: BOOK
TITLE:: Introduction to cataloging and classification
AUTHOR:: Wynar, Bohdan S.
AUTHOR:: Taylor, Arlene G.
DATE:: 1992
PAGES:: 633
COPYRIGHT:: Libraries Unlimited, 1992
SERIES:: Library Science Text Series
END:: UCB//123456
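Because each field in this record is a `NAME:: value` pair, it is easy to read by machine. The sketch below assumes one field per line and ignores multi-line values; the function name `parse_rfc1807` is hypothetical.

```python
from collections import defaultdict

RECORD = """\
BIB-VERSION:: CS-TR-v2.1
ID:: UCB//123456
ENTRY:: September 9, 1997
TYPE:: BOOK
TITLE:: Introduction to cataloging and classification
AUTHOR:: Wynar, Bohdan S.
AUTHOR:: Taylor, Arlene G.
DATE:: 1992
PAGES:: 633
COPYRIGHT:: Libraries Unlimited, 1992
SERIES:: Library Science Text Series
END:: UCB//123456
"""


def parse_rfc1807(text):
    """Map each field name to the list of values it carries (fields may repeat)."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if "::" not in line:
            continue  # assumption: this sketch skips continuation and blank lines
        name, _, value = line.partition("::")
        fields[name.strip().upper()].append(value.strip())
    return dict(fields)


record = parse_rfc1807(RECORD)
print(record["AUTHOR"])
```

Values are kept in lists because a field such as AUTHOR can occur more than once in the same record.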
IS 257 – Fall 2009 2009.01.21 - SLIDE 32
Minimal TEI Header
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Introduction to cataloging and classification</title>
      <respStmt><name>Bohdan S. Wynar</name><resp>8th edition by</resp>
        <name>Arlene G. Taylor</name>
      </respStmt>
    </titleStmt>
    <publicationStmt>
      <distributor>Libraries Unlimited</distributor>
    </publicationStmt>
    <sourceDesc>
      <bibl>Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992.</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
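A header like this can be read with any XML parser once it is well-formed. The sketch below uses Python's standard `xml.etree.ElementTree` on a well-formed copy of the header (every `<name>` closed, and a closing `</teiHeader>`). It assumes no XML namespace, which full TEI documents normally declare.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the minimal header, without a namespace (an assumption).
TEI_HEADER = """\
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Introduction to cataloging and classification</title>
      <respStmt>
        <name>Bohdan S. Wynar</name>
        <resp>8th edition by</resp>
        <name>Arlene G. Taylor</name>
      </respStmt>
    </titleStmt>
    <publicationStmt>
      <distributor>Libraries Unlimited</distributor>
    </publicationStmt>
    <sourceDesc>
      <bibl>Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992.</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
"""

root = ET.fromstring(TEI_HEADER)

# Paths are relative to the <teiHeader> root element.
title = root.findtext("fileDesc/titleStmt/title").strip()
names = [n.text.strip() for n in root.iter("name")]
distributor = root.findtext("fileDesc/publicationStmt/distributor").strip()

print(title, names, distributor)
```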
IS 257 – Fall 2009 2009.01.21 - SLIDE 33
MARC Record (Display)
ID:DCLC9124851-B RTYP:c ST:p FRN: MS:c EL: AD:06-20-91
CC:9110 BLT:am DCF:a CSC: MOD: SNR: ATC: UD:04-11-92
CP:cou L:eng INT: GPC: BIO: FIC:0 CON:b
PC:s PD:1992/ REP: CPI:0 FSI:0 ILC:a II:1
MMD: OR: POL: DM: RR: COL: EML: GEN: BSE:
010 9124851
020 0872878112 (cloth)
020 0872879674 (paper)
040 DLC$cDLC$dDLC
050 00 Z693$b.W94 1991
082 00 025.3$220
100 1 Wynar, Bohdan S.
245 10 Introduction to cataloging and classification /$cBohdan S. Wynar.
250 8th ed. /$bArlene G. Taylor.
260 Englewood, Colo. :$bLibraries Unlimited,$c1992.
300 xvii, 633 p. :$bill. ;$c24 cm.
440 0 Library science text series
504 Includes bibliographical references (p. 591-599) and index.
650 0 Cataloging.
650 0 Subject cataloging.
650 0 Classification$xBooks.
630 00 Anglo-American cataloguing rules.
700 10 Taylor, Arlene G.,$d1941-
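To show how the variable fields in this display break down into tag, indicators, and subfields, here is a rough sketch for a single display line. It handles only this particular display layout, not the MARC transmission format. The function name is hypothetical, and the indicator detection is a heuristic that would misread a field whose data starts with a one- or two-digit token.

```python
import re


def parse_marc_display_line(line):
    """Split one display line into (tag, indicators, [(subfield code, value), ...])."""
    tag, _, rest = line.strip().partition(" ")
    indicators = ""
    # Heuristic (an assumption): a token of one or two digits right after
    # the tag holds the indicators.
    m = re.match(r"(\d{1,2}) (.*)", rest)
    if m:
        indicators, rest = m.group(1), m.group(2)
    # This display omits the leading "$a", so the first chunk is subfield "a".
    first, *others = rest.split("$")
    subfields = [("a", first.strip())]
    subfields += [(chunk[0], chunk[1:].strip()) for chunk in others if chunk]
    return tag, indicators, subfields


print(parse_marc_display_line(
    "245 10 Introduction to cataloging and classification /$cBohdan S. Wynar."
))
```

In the 245 field, subfield `a` carries the title and subfield `c` the statement of responsibility, which is the same split that the ISBD ` / ` punctuation marks.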
IS 257 – Fall 2009 2009.01.21 - SLIDE 34
Lecture Contents
• Course Introduction
• Organization of Information
• Metadata
• Dublin Core
• Controlled Vocabularies
• Discussion
IS 257 – Fall 2009 2009.01.21 - SLIDE 35
Dublin Core
• Simple metadata for describing internet resources
• For “Document-Like Objects”
• 15 Elements (in base DC)
IS 257 – Fall 2009 2009.01.21 - SLIDE 36
Dublin Core (original version)
TITLE: Introduction to cataloging and classification
CREATOR: Taylor, Arlene G.
OTHER CONTRIBUTOR: Wynar, Bohdan S.
DATE: 1992
FORMAT: BOOK
LANGUAGE: ENG
PAGES: 633
PUBLISHER: Libraries Unlimited
SUBJECT: Cataloging.
SUBJECT: subject cataloging.
SUBJECT: Classification -- Books
DESCRIPTION: Textbook on cataloging and classification
RESOURCE TYPE: text.monograph
RESOURCE IDENTIFIER: (ISBN) 0872879674
IS 257 – Fall 2009 2009.01.21 - SLIDE 37
Dublin Core (XML)
<Title>Introduction to cataloging and classification</Title>
<Creator>Taylor, Arlene G.</Creator>
<Contributor>Wynar, Bohdan S.</Contributor>
<Date>1992</Date>
<Format>BOOK</Format>
<Language>ENG</Language>
<Format>633 pages</Format>
<Publisher>Libraries Unlimited</Publisher>
<Subject>Cataloging.</Subject>
<Subject>subject cataloging.</Subject>
<Subject>Classification -- Books.</Subject>
<Description>Textbook on cataloging and classification</Description>
<Type>text.monograph</Type>
<Identifier>(ISBN) 0872879674</Identifier>
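A record like this could be generated from a list of element/value pairs. The sketch below uses Python's standard `xml.etree.ElementTree` with the Dublin Core element namespace and lower-case element names, which differ from the capitalized tags on the slide. The `metadata` wrapper element and the function name `build_dc_record` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Element/value pairs; repeatable elements such as subject simply appear again.
FIELDS = [
    ("title", "Introduction to cataloging and classification"),
    ("creator", "Taylor, Arlene G."),
    ("contributor", "Wynar, Bohdan S."),
    ("date", "1992"),
    ("language", "ENG"),
    ("publisher", "Libraries Unlimited"),
    ("subject", "Cataloging."),
    ("subject", "subject cataloging."),
    ("subject", "Classification -- Books."),
    ("identifier", "(ISBN) 0872879674"),
]


def build_dc_record(fields):
    """Wrap each (element, value) pair in a dc: element under one root."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = build_dc_record(FIELDS)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```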
IS 257 – Fall 2009 2009.01.21 - SLIDE 38
Dublin Core Elements
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
IS 257 – Fall 2009 2009.01.21 - SLIDE 39
Mega-Metadata Standards
• METS - Metadata Encoding and Transmission Standard (http://www.loc.gov/standards/mets)
  – Developed by the Digital Library Federation as an implementation strategy for preservation metadata
  – “XML document format for encoding metadata necessary for both management of digital library objects within a repository and exchange of such objects between repositories (or between repositories and their users)”
  – Provides a flexible mechanism for encoding descriptive, administrative, and structural metadata for a digital library object, and for expressing the complex links between these various forms of metadata
IS 257 – Fall 2009 2009.01.21 - SLIDE 40
Metadata Resources
• Check the Links section from the class home page
• Best site is the “Digital Library: Metadata Resources” page from IFLA at http://www.ifla.org/II/metadata.htm
• For another good source of information on metadata standards see http://www.chin.gc.ca/English/Standards
IS 257 – Fall 2009 2009.01.21 - SLIDE 41
Lecture Contents
• Course Introduction
• Organization of Information
• Metadata
• Dublin Core
• Controlled Vocabularies (Introduction)
• Discussion
IS 257 – Fall 2009 2009.01.21 - SLIDE 42
Controlled Vocabularies
• Next time…