OCLC Online Computer Library Center Virtual International Authority File Prepared by Ed O’Neill...

OCLC Online Computer Library Center Virtual International Authority File Prepared by Ed O’Neill and Rick Bennett, OCLC Presented by Alison Hall International Association of Music Libraries Oslo, Norway, August 2004


Page 1

OCLC Online Computer Library Center

Virtual International

Authority File

Prepared by Ed O’Neill and Rick Bennett, OCLC

Presented by Alison Hall

International Association of Music Libraries

Oslo, Norway, August 2004

Page 2

Background

The IFLA Section on Cataloguing recognized the need for a virtual international authority file where:

Authority records from the world’s national bibliographic agencies could be linked

Available via the Internet

Practical expansion of the concept of universal bibliographic control

Build on the work done by each national bibliographic agency

Allow national or regional variations in authorized form to co-exist

Support worldwide users’ needs for variations in preferred language, script, and spelling

Page 3

Background

The VIAF could be one of the basic building blocks of a “semantic web” when combined with other controlled vocabularies and authority files from sources such as abstracting and indexing services, archives, museums, and publishers.

Libraries now have an opportunity to make a great contribution to this future and should help make this vision a reality.

It is important to the development of this shared vision that the VIAF be made freely available to users worldwide on the web.

Page 4

Joint Project

A project to test the concept of a VIAF is being jointly undertaken by:

Die Deutsche Bibliothek (DDB)

The Library of Congress (LC)

OCLC Online Computer Library Center (OCLC)

Page 5

Project Goal

Demonstrate the feasibility of the VIAF by linking the personal name authority records of:

Personennormdatei (PND)

Library of Congress Name Authority File (LCNAF)

Page 6

What is the VIAF?

The VIAF will be a file of metadata to link users from records in one national bibliographic agency’s personal name authority file to matching records in other national authority files.

The VIAF will provide for access on the web through a specially designed user interface.

The VIAF will provide for multi-lingual and multi-script capability.

The VIAF will use Open Archives Initiative (OAI) protocols to harvest metadata from the agencies’ authority files; the harvested metadata will then be added to the shared servers to keep the file up to date.
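As a rough illustration of the harvesting step, the sketch below builds the kind of request an OAI harvester sends. It assumes the OAI-PMH `ListRecords` verb with its standard `metadataPrefix`, `from`, and `resumptionToken` arguments; the endpoint address and the function name are hypothetical, since the slides name no server or software.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give the address of any OAI server.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix, from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for an incremental harvest."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


# oai_dc is the one metadata format every OAI-PMH repository must support;
# a real authority-file server would more likely offer a MARC-based format.
print(list_records_url(BASE_URL, "oai_dc", from_date="2004-08-01"))
```

Requesting only records changed since the previous harvest is what would let the shared file stay current without re-copying each agency's whole authority file.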

Page 7

Long Term Goal

The system is being designed so that any number of authority files can be linked, not just the two used in the initial project.

Page 8

The Problem

In the LCNAF and PND authority files:

A particular person may have the same established form in both authority files (the ideal)

Different people may be assigned the same established form

Different forms of the name may be established for the same person

Page 9

Two People – One Name

Adams, Mike

In the PND, the name is established for a golfer

In the LCNAF, the name is established for the author of a Beatles collector’s guide

Page 10

Two Names – One Person

LC: Morel, Pierre

PND: Morellus, Petrus 

Page 11

Brief Authority

010 n 84044261

040 DLC $c DLC $d DLC

100 1 Larson, Jack.

670 Thomson, V. The cat, c1982: $b t.p. (Jack Larson)

Page 12

Information in Bibliographic Records

From the bibliographic records we gain significant additional information about Jack Larson:

He is a lyricist

His primary subject area is music (Subject No. 234)

He was published in the 80s and 90s by G. Schirmer and Belwin Mills in New York

He worked with Virgil Thomson and Gerhard Samuel

Jack Larson is the only name he used on his publications

etc.

Page 13

Project Phases

Phase 1: Build enhanced authority files for both PND and LC personal names

Phase 2: Match PND and LC enhanced authority records to create the initial version of the VIAF

Phase 3: Build OAI Server

Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols

Phase 5: Build end-user interface with Unicode displays

Page 14

Phase 1

Building the Enhanced Authority Files

Authority records generally include very few, if any, details about the person or their publishing history

The information is rarely sufficient to determine whether two different authority records represent the same person

To provide the additional information needed to match authority records for the same author unambiguously, information from bibliographic records is used to enhance the authority record

Page 15

Enhancing the Authorities

[Diagram: a Bibliographic Record yields a Derived Authority, which is combined with the Authority Record to produce the Enhanced Authority]

Page 16

Mining the Bibliographic Record

LDR 00826ccm 2200289 a 4500

001 ocm10025532

005 20031229650847.0

008 840627s1982 nyuuua n eng

010 $a 84758340

040 $a DLC $c DLC

019 $a 17706440

020 $c $2.95

028 22 $a 48418 $b G. Schirmer

045 2 $b d198006 $b d198007

048 $b va01 $b ve01 $a ka01

050 00 $a M1529.3 $b .T

100 1 $a Thomson, Virgil, $d 1896-

245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].

260 $a New York : $b G. Schirmer, $c c1982.

300 $a 1 score (11 p.) ; $c 31 cm.

500 $a For soprano, baritone, and piano.

650 0 $a Vocal duets with piano.

600 10 $a Larson, Jack $x Musical settings.

700 1 $a Larson, Jack.

Elements mined from the record: Authors, LC Control Number, LC Classification, Title, Material Type, Publisher, Place of Publication, Language, Date of Publication, Usage
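Mining a record starts with separating each field into its subfields. The following is a minimal sketch of that step, assuming fields arrive as text in the `$a ... $b ...` notation used on the slide; the function name and the regular expression are illustrative and are not taken from the project's software.

```python
import re

# Matches a subfield delimiter such as "$a " or "$9 ": a dollar sign, one
# lowercase letter or digit, then whitespace. Requiring the whitespace keeps
# a price like "$2.95" from being mistaken for a subfield code.
SUBFIELD = re.compile(r"\$([a-z0-9])\s+")


def parse_subfields(field_data):
    """Split the data part of a field into (code, value) pairs."""
    parts = SUBFIELD.split(field_data)
    # parts = [text before first delimiter, code, value, code, value, ...]
    codes, values = parts[1::2], parts[2::2]
    return [(code, value.strip()) for code, value in zip(codes, values)]


title = parse_subfields(
    "$a The cat : $b duet for soprano and baritone / "
    "$c Virgil Thomson ; [words by Jack Larson]."
)
price = parse_subfields("$c $2.95")
print(title)
print(price)
```

Production systems would normally read binary MARC or MARCXML with a dedicated library rather than parse display text; the text form is used here only because it is what the slide shows.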

Page 17

Derived Authority Record

00525nz 2200229n 4500

001 xlc 1

003 OCoLC

005 20040721111415.0

008 040721nneanz||abbn n and d

040 $a OCoLC $b eng $c OCoLC $f viaf

100 1 $a Larson, Jack.

903 $a 84758340

910 14 $a the cat $b duet for soprano and baritone

921 $a g schirmer

922 $a nyu

930 $a jack larson

940 $a eng

942 $a 234

943 $a 198x

944 $a cm

950 1 $a thomson, virgil $d 1896

All text is normalized

Subjects are grouped into approximately [...] subject groups

Material type is coded

Publication date is by decade

Coauthor
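The normalization and decade coding noted on this slide can be sketched as follows. The exact rules the project used are not given, so the rule set here (lowercase, drop punctuation, optionally keep commas in names) is an assumption inferred from the values visible in the derived record, and it handles ASCII letters only.

```python
import re


def normalize(text, keep_commas=False):
    """Lowercase and strip punctuation (a simplified, assumed rule set)."""
    disallowed = r"[^a-z0-9, ]" if keep_commas else r"[^a-z0-9 ]"
    cleaned = re.sub(disallowed, " ", text.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse runs of spaces
    return cleaned.strip(", ")  # drop a trailing comma left over from e.g. "Virgil,"


def decade(year):
    """Code a four-digit publication year by decade, e.g. 1982 becomes '198x'."""
    return str(year)[:3] + "x"


print(normalize("G. Schirmer,"))
print(normalize("Thomson, Virgil,", keep_commas=True))
print(decade(1982))
```

Coding dates by decade and folding case and punctuation make two records comparable even when they describe different editions or were keyed with different conventions.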

Page 18

Enhanced Authority Record

00824nz 2200301n 4500

001 oca01144962

005 19840809154202.7

008 840702n| acannaab| |n aaa |||

010 $a n 84044261

040 $a DLC $c DLC $d DLC

100 1 $a Larson, Jack.

670 $a Thomson, V. The cat, c1982: $b t.p. (Jack Larson)

903 $a 84758340 $9 1

903 $a 93710923 $9 1

910 11 $a the cat $b duet for soprano and baritone $9 1

910 11 $a sun like $b on a poem by jack larson $9 1

921 $a g schirmer $9 1

921 $a belwin mills publ corp $9 2

922 $a nyu $9 2

930 $a jack larson $9 1

940 $a eng $9 2

942 $a 234 $9 2

943 $a 198x $9 1

943 $a 197x $9 1

944 $a cm $9 2

950 11 $a thomson, virgil $d 1896 $9 1

950 11 $a samuel, gerhard $9 1
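One way to picture how derived records accumulate into an enhanced record is a simple tally. The sketch below assumes that `$9` carries an occurrence count, which the slide suggests but does not state; the function name and the way the values are divided between two source records are made up for the example.

```python
from collections import Counter


def enhance(derived_records):
    """Tally how often each (tag, value) pair occurs across derived records."""
    counts = Counter()
    for record in derived_records:
        # A set ensures a value repeated inside one record is counted once.
        counts.update(set(record))
    return counts


# Values taken from the slides; how they split between the two source
# records is an assumption made for this example.
first = [("921", "g schirmer"), ("940", "eng"), ("943", "198x"), ("950", "thomson, virgil")]
second = [("921", "belwin mills publ corp"), ("940", "eng"), ("943", "197x"), ("950", "samuel, gerhard")]

enhanced = enhance([first, second])
for (tag, value), n in sorted(enhanced.items()):
    print(f"{tag} $a {value} $9 {n}")
```

The counts in this toy run are not meant to reproduce those on the slide, which reflect the project's own data.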

Page 19

NACO Personal Name Authorities

Differentiated names: 3,854,587

Undifferentiated names: 38,010

Total authority records: 3,892,597

Page 20

LC Bibliographic Records

Number of records: 6,118,657

Personal Names assigned: 6,569,957

Unique Personal Names: 2,674,687

Page 21

PND Personal Name Authorities

Total authority records: 2,498,071

Page 22

DDB Bibliographic Records

Die Deutsche Bibliothek (DDB): 6,316,675

Bibliotheksverbund Bayern (BVB): 5,022,316

Total number of records: 11,338,991

Number of assignments: 12,080,387

Number of unique names: 2,371,461

Page 23

Phase 2

Matching the Enhanced Authorities

Page 24

Linking Retrospective Files

[Diagram: Enhanced LCNAF Authorities and Enhanced PND Authorities are fed to the Matching Algorithms, which produce the VIAF Authorities]
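The slides do not describe the matching algorithms themselves. The sketch below is one assumed approach, not the project's method: ignore the heading and treat two enhanced records as the same person when they share enough titles or co-authors. Tags 910 and 950 appear to hold titles and co-authors in the derived record; the function names, the threshold, and the two comparison records are hypothetical.

```python
def shared_evidence(rec_a, rec_b, tags=("910", "950")):
    """Return the (tag, value) pairs, for the chosen tags, found in both records."""
    a = {(t, v) for t, v in rec_a if t in tags}
    b = {(t, v) for t, v in rec_b if t in tags}
    return a & b


def is_match(rec_a, rec_b, threshold=1):
    """Treat two records as the same person if they share enough evidence."""
    return len(shared_evidence(rec_a, rec_b)) >= threshold


# Values from the derived record shown earlier: 910 = title, 950 = co-author.
larson = [("100", "larson, jack"), ("910", "the cat"), ("950", "thomson, virgil")]

# The next two records are invented for illustration only.
same_person_other_file = [("100", "larson, j."), ("910", "the cat"), ("950", "thomson, virgil")]
same_name_other_person = [("100", "larson, jack"), ("910", "a hypothetical unrelated title")]

print(is_match(larson, same_person_other_file))  # shares a title and a co-author
print(is_match(larson, same_name_other_person))  # shares only the heading
```

Leaving the heading out of the comparison reflects the two problem cases from the earlier slides: one name can belong to two people, and one person can have two established names.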

Page 25

Future of VIAF?

If the proof-of-concept is successful, the VIAF will be expanded:

To include other authority files for personal names,

To include other types of authorities

– Corporate names,

– Geographic names,

– etc.

Page 26

First VIAF Record

Page 27

Phase 3: Build OAI Server

LCNAF

DDB/PND

OAI Server(s)

Slide Courtesy of Barbara Tillett, Library of Congress

Page 28

Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols

Slide Courtesy of Barbara Tillett, Library of Congress

Page 29

Phase 5: Build End User Interface with Unicode displays

User’s cookie specifies that Hangul is preferred.

Display 700 form, building on local system’s authority structure

Slide Courtesy of Barbara Tillett, Library of Congress
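The display step on this slide amounts to choosing, from a set of linked headings, the one that suits the user's stored preference. The sketch below is a simplified stand-in: it keys the linked forms by source file rather than by script, reusing the Morel example from the earlier slide, and the function name and fallback rule are assumptions.

```python
# One linked cluster, using the Morel example from the earlier slide.
cluster = {"LC": "Morel, Pierre", "PND": "Morellus, Petrus"}


def display_form(cluster, preferred_source, fallback_source="LC"):
    """Pick the heading to display for a user who prefers one agency's form."""
    if preferred_source in cluster:
        return cluster[preferred_source]
    # No heading from the preferred source: fall back to a default form.
    return cluster[fallback_source]


# In a web interface the preference would be read from the user's cookie.
print(display_form(cluster, "PND"))
print(display_form(cluster, "XX"))
```

A script preference such as Hangul could use the same lookup, with the cluster keyed by script instead of by source file.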

Page 30

Questions?

Thank you


http://www.oclc.org/research/projects/viaf