Virtual International Authority File
ALA, June 2006
Richard Bennett OCLC
Christel Hengel DDB
Thomas B. Hickey OCLC
Edward T. O’Neill OCLC
Barbara B. Tillett LC
VIAF
DDB/LC/OCLC
Virtual International Authority File
Link authority records from national bibliographic agencies
Build on their authority work
Expand the concept of universal bibliographic control
• Allow national or regional variations in authorized form to co-exist
• Support needs for variations in preferred language, script, and spelling
Joint VIAF Project
Semantic Web Building Blocks
End-user
A&I controlled vocabularies
(Library) authority files
Other controlled vocabularies
“Ontologies”
Project Goal
Demonstrate feasibility of linking personal names across:
• Personennormdatei (PND)
• Library of Congress Name Authority File (LCNAF)
What is the VIAF?
System
• Links between files
• Web browser access
• Multilingual and multi-script
Maintenance
• National agencies control their records
• Records harvested from national systems
Scalable
• Any number of national authority files
Matching Variations
In the LCNAF and PND authority files:
• Same name, same person
• Same name, different people
• Different names, same person
• Missing person in one file
Two Different People – One Name
Adams, Mike
PND: a golfer
LCNAF: author of a Beatles collector's guide
Same name, different people
One Person – Two Names
LCNAF: Morel, Pierre
PND: Morellus, Petrus
Same person, different names
Enhancing the Authorities
Bibliographic Record
Derived Authority
Authority Record
Enhanced Authority
Mining the Bibliographic Record
LDR 00826ccm 2200289 a 4500
1 ocm10025532
5 20031229650847.0
8 840627s1982 nyuuua n eng
10 $a 84758340
40 $a DLC $c DLC
19 $a 17706440
20 $c $2.95
28 22 $a 48418 $b G. Schirmer
45 2 $b d198006 $b d198007
48 $b va01 $b ve01 $a ka01
50 00 $a M1529.3 $b .T
100 1 $a Thomson, Virgil, $d 1896-
245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].
260 $a New York : $b G. Schirmer, $c c1982.
300 $a 1 score (11 p.) ; $c 31 cm.
500 $a For soprano, baritone, and piano.
650 0 $a Vocal duets with piano.
600 10 $a Larson, Jack $x Musical settings.
700 1 $a Larson, Jack.
Authors
LC Control Number
LC Classification
Title
Material Type
Publisher
Place of Publication
Language
Date of Publication
Usage
Derived Authority Record
00525nz 2200229n 4500
0 1 xlc 1
1 3 OCoLC
2 5 20040721111415.0
3 8 040721nneanz||abbn n and d
4 40 $a OCoLC $b eng $c OCoLC $f viaf
5 100 1 $a Larson, Jack.
6 903 $a 84758340
7 910 14 $a the cat $b duet for soprano and baritone
8 921 $a g schirmer
9 922 $a nyu
10 930 $a jack larson
11 940 $a eng
12 942 $a 234
13 943 $a 198x
14 944 $a cm
15 950 1 $a thomson, virgil $d 1896
All text is normalized
Subjects are grouped into broad subject areas
Material type is coded
Publication date is by decade
Coauthor
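The notes above say that text is normalized and that publication dates are reduced to a decade. The following is a minimal sketch of what such steps could look like. The function names and the exact normalization rules (lowercasing, dropping diacritics and punctuation) are assumptions for illustration, not the project's actual rules; the derived record on the slide, for instance, keeps the comma in names.

```python
import re
import unicodedata
from typing import Optional


def normalize_text(value: str) -> str:
    """Lowercase, strip diacritics, drop punctuation, collapse whitespace.

    Hypothetical rules: the slides only state that text is normalized.
    """
    decomposed = unicodedata.normalize("NFKD", value)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]", " ", no_marks.lower())
    return " ".join(cleaned.split())


def decade_code(date_field: str) -> Optional[str]:
    """Reduce a date such as 'c1982.' to a decade code such as '198x'."""
    match = re.search(r"\d{4}", date_field)
    if match is None:
        return None
    return match.group(0)[:3] + "x"


# Values taken from the 260 field of the bibliographic record shown earlier.
publisher = normalize_text("G. Schirmer,")
decade = decade_code("c1982.")
```

With the sample record, the publisher and decade come out in the same shape as the 921 and 943 fields of the derived authority record.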
Enhanced Authority Record
Strong Matching Attributes
• A work (title) in common
• Common control numbers (ISBN, ISSN, or LCCN)
• Exact birth and death year
• Joint authors
• Name as subject
Weaker Attributes
• Only one of birth/death date(s) (allows some variation)
• Subject area of works (two levels)
• Format (books, films, musical scores, etc.)
• Language
• Publisher
• Partial title match
• Date of publication
• Country
• Role (author, illustrator, composer, etc.)
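The slides list strong and weaker attributes but do not say how they are combined into a match decision. One possible reading is a weighted score, sketched below. The attribute keys, the weights, and the threshold are all hypothetical, chosen only so that a single strong attribute is enough to match while two weak ones are not.

```python
# Attribute keys are illustrative labels for the items on the two slides.
STRONG = {"title", "control_number", "exact_dates", "joint_author", "name_as_subject"}
WEAK = {
    "partial_dates", "subject_area", "format", "language", "publisher",
    "partial_title", "publication_date", "country", "role",
}


def match_score(shared: set, strong_weight: float = 3.0, weak_weight: float = 1.0) -> float:
    """Score a candidate pair by the attributes its two records share.

    The weights are assumptions; the slides give no numeric values.
    """
    strong_hits = len(shared & STRONG)
    weak_hits = len(shared & WEAK)
    return strong_hits * strong_weight + weak_hits * weak_weight


def is_match(shared: set, threshold: float = 3.0) -> bool:
    """Accept the pair when its score reaches the (assumed) threshold."""
    return match_score(shared) >= threshold
```

Under these assumed weights, two records sharing only language and publisher would not be linked, while a shared control number alone would be.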
Match type | LCNAF | PND
Exact name match with dates | 100 Menzel, Adolph, $d 1815-1905 | 400 Menzel, Adolph $d 1815-1905
Standard number | 901 345816110 $9 1 | 901 345816110 $9 1
Exact title match | 910 friedrich der grosse und sein friedenswerk $9 1 | 910 friedrich der grosse und sein friedenswerk $9 4
Publisher | 921 bruckman $9 1 | 921 bruckman $9 8
Language | 940 ger $9 27 | 940 ger $9 267
Role | 941 ill $9 1 | 941 ill $9 18
Subject | 942 243 $9 7 | 942 243 $9 1
Decade | 943 184x $9 1 | 943 184x $9 2
Joint author | 950 achenbach, sigrid $d 1944 $9 1 | 950 achenbach, sigrid $9 2
Corporate name | 951 hamburger kunsthalle $9 2 | 951 hamburger kunsthalle $9 1
LC Names
Established Names: 4,187,973
Names from Bib Records: 3,440,706
Uncontrolled Names: 883,882
Orphaned Names: 1,631,149
Active Established Names: 2,556,824
DDB Names
Established Names: 2,659,276
Names from Bib Records: 2,319,829
Uncontrolled (Undif’d) Names: 306,211
Orphaned Names: 645,658
Active Established Names: 2,013,618
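The counts on the LC Names and DDB Names slides appear to describe two overlapping sets: established names minus orphaned names, and names from bibliographic records minus uncontrolled names, both give the active established names. That reading of the diagrams is an assumption. The sketch below checks it against the published figures.

```python
# Counts copied from the "LC Names" and "DDB Names" slides.
FILES = {
    "LC": {
        "established": 4_187_973,
        "from_bib": 3_440_706,
        "uncontrolled": 883_882,
        "orphaned": 1_631_149,
    },
    "DDB": {
        "established": 2_659_276,
        "from_bib": 2_319_829,
        "uncontrolled": 306_211,
        "orphaned": 645_658,
    },
}


def active_established(counts: dict) -> int:
    """Derive the overlap from both sides and confirm the two routes agree."""
    via_authority = counts["established"] - counts["orphaned"]
    via_bib = counts["from_bib"] - counts["uncontrolled"]
    if via_authority != via_bib:
        raise ValueError("counts are not consistent with a two-set overlap")
    return via_authority


lc_active = active_established(FILES["LC"])
ddb_active = active_established(FILES["DDB"])
```

Both routes agree for each file and reproduce the "Active Established Names" figures on the slides.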
Results
Matches: 558,618
Complex Matches: 70,797
Unique Matches: 487,821
VIAF File
DDB Names: 2,659,276
LC Names: 4,187,973
Common: 558,618 (70% of potential)
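As a quick consistency check on the reported figures, the sketch below confirms that unique matches equal all matches minus complex matches, and computes what share of each source file the common names represent. The slides do not define the "potential" behind the 70% figure, so it is not reproduced here. The helper name `share` is illustrative.

```python
# Figures copied from the "Results" and "VIAF File" slides.
DDB_NAMES = 2_659_276
LC_NAMES = 4_187_973
COMMON = 558_618
COMPLEX_MATCHES = 70_797
UNIQUE_MATCHES = 487_821


def share(part: int, whole: int) -> float:
    """Return part/whole as a fraction between 0 and 1."""
    return part / whole


# Unique matches should be what remains after removing complex matches.
results_consistent = (COMMON - COMPLEX_MATCHES == UNIQUE_MATCHES)

ddb_share = share(COMMON, DDB_NAMES)  # fraction of DDB names with an LC match
lc_share = share(COMMON, LC_NAMES)    # fraction of LC names with a DDB match
```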
Next Steps
• Move to incremental updates
• Start harvesting national files
• Bring up Web interface (to full files)
• Make OAI accessible
• Bring in new participants
• Handle non-Roman matching
Move to other types of authorities
• Corporate names
• Geographic names
• …
Stage 3: Build OAI Server
LCNAF
DDB/PND
OAI Server(s)
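Once the authority files sit behind OAI servers, a harvester could request new or changed records with an OAI-PMH `ListRecords` call. The sketch below only builds the request URL. The base URL is a placeholder, and the `oai_dc` default is chosen because OAI-PMH requires every repository to support it; the format the VIAF servers would actually expose is not stated in the slides.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL.

    from_date restricts the harvest to records changed on or after that
    date, which is what an incremental update would need.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, not a real VIAF or national-library address.
url = list_records_url("https://example.org/oai", from_date="2006-06-01")
```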
Stage 4: Ongoing maintenance
Stage 5: Build End User Interface with Unicode displays
User’s cookie specifies Hangul is preferred.
Display 700 form, building on local system’s authority structure
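The idea of displaying a heading in the user's preferred script can be sketched as choosing, among the linked forms of a name, the first one written in that script. The helper names and the sample name below are hypothetical. Script detection here relies on the first word of each character's Unicode name, which is a rough heuristic rather than a full script lookup.

```python
import unicodedata
from typing import List, Optional


def script_of(text: str) -> Optional[str]:
    """Return a rough script label (e.g. 'HANGUL', 'LATIN') for the first letter."""
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            return name.split(" ")[0] if name else None
    return None


def preferred_form(forms: List[str], preferred_script: str) -> str:
    """Pick the first form in the preferred script, else fall back to the first form."""
    for form in forms:
        if script_of(form) == preferred_script:
            return form
    return forms[0]


# Illustrative linked forms of one (made-up) name; the preference would
# come from the user's cookie.
forms = ["Kim, Min-su", "김민수"]
display = preferred_form(forms, "HANGUL")
```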
Thank you
T. Hickey
http://errol.oclc.org/laf/n82-54463.html
ALA June 2006