OCLC Online Computer Library Center
Virtual International Authority File
Prepared by Ed O’Neill and Rick Bennett, OCLC
Presented by Alison Hall
International Association of Music Libraries
Oslo, Norway, August 2004
Background
The IFLA Section on Cataloguing recognized the need for a virtual international authority file where:
Authority records from the world’s national bibliographic agencies could be linked
Available via the Internet
Practical expansion of the concept of universal bibliographic control
Build on the work done by each national bibliographic agency
Allow national or regional variations in authorized form to co-exist
Support worldwide users’ needs for variations in preferred language, script, and spelling
Background
The VIAF could be one of the basic building blocks of a “semantic web” when combined with other controlled vocabularies and authority files from such sources as abstracting and indexing services, archives, museums, and publishers.
Libraries now have an opportunity to make a great contribution to this future and should help make this vision a reality.
It is important to the development of this shared vision that the VIAF be made freely available to users worldwide on the web.
Joint Project
A project to test the concept of a VIAF is being jointly undertaken by:
Die Deutsche Bibliothek (DDB)
The Library of Congress (LC)
OCLC Online Computer Library Center (OCLC)
Project Goal
Demonstrate the feasibility of VIAF by linking the personal names authority records between:
Personennormdatei (PND)
Library of Congress Name Authority File (LCNAF)
What is the VIAF?
The VIAF will be a file of metadata to link users from records in one national bibliographic agency’s personal name authority file to matching records in other national authority files.
The VIAF will provide for access on the web through a specially designed user interface.
The VIAF will provide for multi-lingual and multi-script capability.
The VIAF will use Open Archives Initiative (OAI) protocols to harvest metadata from the agencies’ authority files; the harvested metadata will then be added to the shared servers to keep the file up to date.
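A harvest of this kind is usually issued as an OAI-PMH `ListRecords` request. The sketch below only builds the request URL and does not contact any server. The base URL and the `marcxml` metadata prefix are assumptions for illustration; the presentation does not say which endpoint or record format the project uses.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix,
                           from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix and the date range.
        params = {"verb": "ListRecords",
                  "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords",
                  "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: only records changed since from_date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only to show the shape of the request.
url = build_list_records_url("https://authority.example.org/oai",
                             "marcxml", from_date="2004-08-01")
print(url)
```

Passing `from_date` is what would let a harvester pick up only the authority records changed since its last run.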
Long Term Goal
The system is being designed so that any number of authority files can be linked, not just the two used initially.
The Problem
In the LCNAF and PND authority files:
A particular person will have the same established form in both authority files (the ideal)
Different people may be assigned the same established form
Different forms of the name may be established for the same person
Two People – One Name
Adams, Mike
In the PND, the name is established for a golfer
In LCNAF, the name is established for an author of a Beatles collector’s guide
Brief Authority
010 n 84044261
040 DLC $c DLC $d DLC
100 1 Larson, Jack.
670 Thomson, V. The cat, c1982: $b t.p. (Jack Larson)
Information in Bibliographic Records
From the bibliographic records we gain significant additional information about Jack Larson:
He is a lyricist
His primary subject area is music (Subject No. 234)
Was published in the 80s and 90s by G. Schirmer and Belwin Mills in New York
Worked with Virgil Thomson and Gerhard Samuel
Jack Larson is the only name he used on his publications
etc.
Project Phases
Phase 1: Build enhanced authority files for both PND and LC personal names
Phase 2: Match PND and LC enhanced authority records to create the initial version of the VIAF
Phase 3: Build OAI Server
Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols
Phase 5: Build end user interface with Unicode displays
Phase 1
Building the Enhanced Authority Files
Authority records generally include very few, if any, details about the person and/or their publishing history
The information is rarely sufficient to determine whether two different authority records represent the same person
To provide the additional information needed to match authority records for the same author unambiguously, information from bibliographic records is used to enhance the authority record
Enhancing the Authorities
[Diagram: Bibliographic Record → Derived Authority; Derived Authority + Authority Record → Enhanced Authority]
Mining the Bibliographic Record
LDR 00826ccm 2200289 a 4500
1 ocm10025532
5 20031229650847.0
8 840627s1982 nyuuua n eng
10 $a 84758340
40 $a DLC $c DLC
19 $a 17706440
20 $c $2.95
28 22 $a 48418 $b G. Schirmer
45 2 $b d198006 $b d198007
48 $b va01 $b ve01 $a ka01
50 00 $a M1529.3 $b .T
100 1 $a Thomson, Virgil, $d 1896-
245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].
260 $a New York : $b G. Schirmer, $c c1982.
300 $a 1 score (11 p.) ; $c 31 cm.
500 $a For soprano, baritone, and piano.
650 0 $a Vocal duets with piano.
600 10 $a Larson, Jack $x Musical settings.
700 1 $a Larson, Jack.
Data elements mined from the record:
Authors
LC Control Number
LC Classification
Title
Material Type
Publisher
Place of Publication
Language
Date of Publication
Usage
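Mining a field starts with splitting its data into subfields. The following is a minimal sketch that works on the textual display used in these slides (`$a`, `$b`, ...), not on binary MARC; the function name is hypothetical. It assumes a subfield code is always followed by whitespace, which keeps a price such as `$2.95` from being misread as subfield `2`.

```python
import re

# A subfield delimiter: "$", one code character, then whitespace.
SUBFIELD = re.compile(r"\$([a-z0-9])\s+")


def parse_subfields(field_data):
    """Split displayed field data into (code, value) pairs."""
    parts = SUBFIELD.split(field_data)
    # re.split with one capture group yields:
    # [text before first code, code, value, code, value, ...]
    codes = parts[1::2]
    values = parts[2::2]
    return [(code, value.strip()) for code, value in zip(codes, values)]


title = parse_subfields(
    "$a The cat : $b duet for soprano and baritone / "
    "$c Virgil Thomson ; [words by Jack Larson]."
)
print(title)
```

Production work would normally read the MARC records with a dedicated library rather than parse their display form; the sketch only shows the structure of the data.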
Derived Authority Record
00525nz 2200229n 4500
1 xlc 1
3 OCoLC
5 20040721111415.0
8 040721nneanz||abbn n and d
40 $a OCoLC $b eng $c OCoLC $f viaf
100 1 $a Larson, Jack.
903 $a 84758340
910 14 $a the cat $b duet for soprano and baritone
921 $a g schirmer
922 $a nyu
930 $a jack larson
940 $a eng
942 $a 234
943 $a 198x
944 $a cm
950 1 $a thomson, virgil $d 1896
All text is normalized
Subjects are grouped into approximately [number not preserved] subject groups
Material type is coded
Publication date is by decade
Coauthor
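The normalization and decade coding can be sketched as follows. The exact rules used by the project are not given in the slides, so the rule here (lowercase, drop punctuation, collapse whitespace) is an assumption chosen to reproduce the publisher and date values shown in the derived record. Names in that record keep their comma (`thomson, virgil`), so this sketch is not applied to name fields.

```python
import re


def normalize(text):
    """Assumed rule: lowercase, drop punctuation, collapse whitespace."""
    lowered = text.lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    return " ".join(no_punct.split())


def decade_code(date_text):
    """Reduce a publication date such as 'c1982.' to a decade code."""
    match = re.search(r"\d{4}", date_text)
    if match is None:
        return None  # no four-digit year found
    return match.group()[:3] + "x"


print(normalize("G. Schirmer,"))   # publisher from field 260 $b
print(decade_code("c1982."))       # date from field 260 $c
```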
Enhanced Authority Record
00824nz 2200301n 4500
1 oca01144962
5 19840809154202.7
8 840702n| acannaab| |n aaa |||
10 $a n 84044261
40 $a DLC $c DLC $d DLC
100 1 $a Larson, Jack.
670 $a Thomson, V. The cat, c1982: $b t.p. (Jack Larson)
903 $a 84758340 $9 1
903 $a 93710923 $9 1
910 11 $a the cat $b duet for soprano and baritone $9 1
910 11 $a sun like $b on a poem by jack larson $9 1
921 $a g schirmer $9 1
921 $a belwin mills publ corp $9 2
922 $a nyu $9 2
930 $a jack larson $9 1
940 $a eng $9 2
942 $a 234 $9 2
943 $a 198x $9 1
943 $a 197x $9 1
944 $a cm $9 2
950 11 $a thomson, virgil $d 1896 $9 1
950 11 $a samuel, gerhard $9 1
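One way to read the enhanced record is that the derived authorities for the same heading are merged and each distinct value carries an occurrence count in `$9`. That reading of `$9` is an assumption; the slides do not define the subfield. Under it, the merge can be sketched with a counter over (tag, value) pairs. The two input records below are illustrative subsets, not complete derived records.

```python
from collections import Counter


def merge_derived(derived_records):
    """Count in how many derived records each (tag, value) pair occurs."""
    counts = Counter()
    for record in derived_records:
        # set() so a value repeated inside one record is counted once.
        for pair in set(record):
            counts[pair] += 1
    return counts


# Illustrative subsets of two derived authority records.
records = [
    [("922", "nyu"), ("943", "198x")],
    [("922", "nyu"), ("943", "197x")],
]
merged = merge_derived(records)
for (tag, value), count in sorted(merged.items()):
    print(f"{tag} $a {value} $9 {count}")
```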
NACO Personal Name Authorities
Differentiated names: 3,854,587
Undifferentiated names: 38,010
Total authority records: 3,892,597
LC Bibliographic Records
Number of records: 6,118,657
Personal Names assigned: 6,569,957
Unique Personal Names: 2,674,687
DDB Bibliographic Records
Die Deutsche Bibliothek (DDB): 6,316,675
Bibliotheksverbund Bayern (BVB): 5,022,316
Total number of records: 11,338,991
Number of assignments: 12,080,387
Number of unique names: 2,371,461
Linking Retrospective Files
[Diagram: Enhanced LCNAF Authorities + Enhanced PND Authorities → Matching Algorithms → VIAF Authorities]
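The slides do not describe the matching algorithms themselves. As a rough illustration of how enhanced records make matching possible, the sketch below counts the kinds of evidence (titles, coauthors, publishers) that two enhanced authorities share. The function names, the attribute keys, the threshold of two shared attributes, and the PND-side record are all hypothetical.

```python
def shared_evidence(a, b):
    """Return, per attribute, the normalized values both records share."""
    common = {}
    for key in a.keys() & b.keys():
        overlap = a[key] & b[key]
        if overlap:
            common[key] = overlap
    return common


def is_probable_match(a, b, min_shared=2):
    """Assumed rule: match when enough attributes have a shared value."""
    return len(shared_evidence(a, b)) >= min_shared


# LCNAF side: values taken from the enhanced record for Larson, Jack.
lcnaf = {
    "title": {"the cat", "sun like"},
    "coauthor": {"thomson, virgil", "samuel, gerhard"},
    "publisher": {"g schirmer", "belwin mills publ corp"},
}
# PND side: a hypothetical record, invented for this example.
pnd = {
    "title": {"the cat"},
    "coauthor": {"thomson, virgil"},
    "publisher": {"schott"},
}
print(shared_evidence(lcnaf, pnd))
print(is_probable_match(lcnaf, pnd))
```

A rule of this kind is what would separate the two people established as Adams, Mike: identical headings, but no titles, coauthors, or publishers in common.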
Future of VIAF?
If the proof-of-concept is successful, the VIAF will be expanded:
To include other authority files for personal names,
To include other types of authorities
– Corporate names,
– Geographic names,
– etc.
Phase 3: Build OAI Server
[Diagram: LCNAF and DDB/PND connected to the OAI Server(s)]
Slide Courtesy of Barbara Tillett, Library of Congress
Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols
Slide Courtesy of Barbara Tillett, Library of Congress
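Ongoing harvesting with OAI-PMH is paged: each response may carry a `resumptionToken` that the harvester sends back to get the next batch. The sketch below shows that loop against two hand-written response pages instead of a live server. The record identifiers, the token value, and the `fetch_page` callable are assumptions for illustration; only the namespace and element names come from the OAI-PMH protocol.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def extract_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    has_token = token_el is not None and token_el.text
    return identifiers, (token_el.text.strip() if has_token else None)


def harvest(fetch_page):
    """Follow resumption tokens until the server reports no more pages."""
    collected, token = [], None
    while True:
        identifiers, token = extract_page(fetch_page(token))
        collected.extend(identifiers)
        if token is None:
            return collected


def make_page(identifier, token):
    """Build a minimal, hypothetical ListRecords response."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"<record><header><identifier>{identifier}</identifier></header></record>"
        f"<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


# Two canned pages keyed by the token that requests them.
PAGES = {
    None: make_page("oai:example.org:record-1", "page2"),
    "page2": make_page("oai:example.org:record-2", ""),
}
print(harvest(PAGES.get))
```

An empty `resumptionToken` element marks the last page, which is why the loop treats missing and empty tokens the same way.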
Phase 5: Build End User Interface with Unicode displays
User’s cookie specifies that Hangul is preferred.
Display 700 form, building on local system’s authority structure
Slide Courtesy of Barbara Tillett, Library of Congress
Questions?
Thank you
http://www.oclc.org/research/projects/viaf