FRBR information exchange
Thomas Hickey & Jenny Toves, OCLC Research
Current FRBR information exchange
Sets of MARC-21 records
  • Both bibliographic and authority
  • Sometimes extended
pKeys
Unique pKeys
Lists of sets of control numbers
xISBN web service
superWork records
Some background
Our FRBRization has been done primarily at the work level
We have FRBRized:
  • OCLC WorldCat
    • ~60,000,000 records
    • ~1,000,000,000 holdings
    • Used in Open WorldCat, FictionFinder now
    • Will be visible in FirstSearch displays this fall
  • Norwegian BIBSYS records
  • Finnish national bibliography (now in WorldCat)
  • Electronic thesis metadata
Processing done on a 24-node Beowulf Linux cluster
MARC 21 bibliographic data
Basic method of accepting information
Other formats get mapped into it
Fields we use:
  • Author main entry
  • Titles
  • ISBN
  • Personal name added entries
  • Language
Extensions
  • BIBSYS use of 490 fields to indicate hierarchy
MARC 21 Authority data
Map personal names using cross references
Map author-titles using cross references
Fields we currently use
  • 008 fixed field
  • 100, 130, 400
Extensions
  • Files of additional cross references
  • Common title patterns
  • xISBN matching
pKeys
An author-title key for matching
Derived from MARC-like records & authority data
ocm00019613 shakespeare, william\1564 1616/hamlet
ocm00615676 /hamlet/shakespeare, william\1564 1616
ocm14055779 hamlet motion picture 1948
ocm00290352 /hamlet/ocm00290352
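As a rough illustration of how a key of this shape could be derived, here is a minimal Python sketch. The normalization rules (lowercasing, stripping accents, replacing punctuation with spaces) and the names `normalize` and `make_pkey` are assumptions inferred from the sample keys on this slide, not the actual OCLC algorithm, and authority-file lookups are left out entirely.

```python
import re
import unicodedata


def normalize(text: str, keep_commas: bool = False) -> str:
    """Lowercase, strip accents, and turn punctuation into single spaces.

    Assumed rules, chosen only to reproduce the sample keys on the slide.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    pattern = r"[^a-z0-9, ]+" if keep_commas else r"[^a-z0-9 ]+"
    cleaned = re.sub(pattern, " ", no_accents.lower())
    return re.sub(r"\s+", " ", cleaned).strip(" ,")


def make_pkey(title: str, author: str = "", dates: str = "",
              control_number: str = "") -> str:
    """Build an author-title key (hypothetical helper).

    With an author:    author\\dates/title
    Without an author: /title/control_number, so the key stays unique
    """
    norm_title = normalize(title)
    if author:
        key = normalize(author, keep_commas=True)
        if dates:
            key += "\\" + normalize(dates)
        return key + "/" + norm_title
    return "/" + norm_title + "/" + control_number


print(make_pkey("Hamlet", "Shakespeare, William", "1564-1616"))
print(make_pkey("Hamlet", control_number="ocm00290352"))
```

Appending the control number to title-only keys mirrors the last sample line, where a record with no author cannot safely be merged with other records that merely share a title.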
Unique pKeys
pKeys that have been sorted and counted
692 sw00008899 milton, john\1608 1674/poems
691 sw00255854 puccini, giacomo\1858 1924/tosca
690 sw00020874 chaucer, geoffrey\d 1400/canterbury tales
688 sw00237074 melville, herman\1819 1891/moby dick
682 sw03620985 china/laws etc
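The "sorted and counted" step can be sketched in a few lines of Python. This is only an illustration: the function name `unique_pkeys` is hypothetical, and how the `sw` identifiers are assigned to each unique key is not described in the slides, so that part is omitted.

```python
from collections import Counter


def unique_pkeys(pkeys):
    """Return (count, pkey) pairs, most frequent first, ties in key order."""
    counts = Counter(pkeys)
    return sorted(
        ((n, key) for key, n in counts.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )


# Illustrative input: one pKey per bibliographic record.
sample = [
    "melville, herman\\1819 1891/moby dick",
    "china/laws etc",
    "melville, herman\\1819 1891/moby dick",
]
for count, key in unique_pkeys(sample):
    print(count, key)
```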
Lists of control numbers
sw00000089 00206765 01261413 00000089 01236648 03975229 08360541 07363127
sw00000169 00000169 01647333 00420563 10957239 05205626 02325844 07299473 08244692 08555721 24509677 02533498 03967788 24728032 10130242 04849080 09477230 23323184 22051264 38870301 54266609 56760701 08366329
sw00000182 00000182 00102731
sw00000201 00000201 02786659
sw00000210 00000210 09175561
sw00000245 00000245 34103639
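A file in this layout is easy to read back into memory. The sketch below assumes one work-set per line, with the `sw` identifier first and the member control numbers after it, separated by whitespace; the function name `parse_work_sets` is hypothetical.

```python
def parse_work_sets(lines):
    """Map each work-set id (sw...) to its list of member control numbers."""
    work_sets = {}
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        work_id, members = tokens[0], tokens[1:]
        if not work_id.startswith("sw"):
            raise ValueError(f"expected a work-set id, got {work_id!r}")
        work_sets[work_id] = members
    return work_sets


listing = [
    "sw00000182 00000182 00102731",
    "sw00000201 00000201 02786659",
]
print(parse_work_sets(listing))
```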
xISBN web service
Takes an ISBN as input
Returns list of ISBNs in associated work
Significant processing
  • Starts with control-number list of work-sets
  • Uses ISBNs to pull work-sets together
  • Allows fuzzy-matching on author/title
  • Ends up with consistent clusters
    • In general larger than those in control-number list
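The step of using ISBNs to pull work-sets together can be pictured as a union-find (disjoint-set) merge: any two work-sets that share an ISBN end up in the same cluster. The sketch below is one possible way to do this, not a description of the actual xISBN processing; the function name `cluster_by_isbn` and the sample identifiers are made up, and the fuzzy author/title matching is not covered.

```python
from collections import defaultdict


def cluster_by_isbn(work_isbns):
    """Merge work-sets that share at least one ISBN.

    work_isbns maps a work-set id to the ISBNs found in its records.
    Returns a list of ISBN sets, one per merged cluster.
    """
    parent = {work_id: work_id for work_id in work_isbns}

    def find(work_id):
        # Follow parent links to the root, halving the path as we go.
        while parent[work_id] != work_id:
            parent[work_id] = parent[parent[work_id]]
            work_id = parent[work_id]
        return work_id

    first_seen = {}  # ISBN -> first work-set that contained it
    for work_id, isbns in work_isbns.items():
        for isbn in isbns:
            if isbn in first_seen:
                root_a, root_b = find(first_seen[isbn]), find(work_id)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_seen[isbn] = work_id

    clusters = defaultdict(set)
    for work_id, isbns in work_isbns.items():
        clusters[find(work_id)].update(isbns)
    return list(clusters.values())


# Made-up identifiers: swA and swB share isbn-1, so they merge.
example = {
    "swA": ["isbn-1"],
    "swB": ["isbn-1", "isbn-2"],
    "swC": ["isbn-3"],
}
print(cluster_by_isbn(example))
```

Because merging is transitive, the resulting clusters can be larger than any single work-set in the control-number list, which is consistent with the last point on the slide.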
xISBN examples
[0130188549, 0130188476]:
sw11067396 barnea, amir/agency problems and financial contracting
sw13096363 barnea, amir/agency problems on financial contracting
[000713407x, 0007126360, 0007134053, 0007134061, 0007126441]:
sw48486275 /collins new school dictionary/ocm48486275
sw49740193 /collins new school dictionary/ocm49740193
sw49740203 /collins new school dictionary/ocm49740203
xISBN XML response
<?xml version="1.0" encoding="UTF-8" ?>
<idlist>
  <isbn>000713407x</isbn>
  <isbn>0007126360</isbn>
  <isbn>0007134053</isbn>
  <isbn>0007134061</isbn>
  <isbn>0007126441</isbn>
</idlist>
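A client could read a response of this shape with Python's standard library. The sketch below parses the sample document from this slide; the function name `parse_idlist` is hypothetical, and no service URL or request format is assumed because the slides do not give one.

```python
import xml.etree.ElementTree as ET

# The response shown on the slide, as a single string.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8" ?>'
    "<idlist>"
    "<isbn>000713407x</isbn>"
    "<isbn>0007126360</isbn>"
    "<isbn>0007134053</isbn>"
    "<isbn>0007134061</isbn>"
    "<isbn>0007126441</isbn>"
    "</idlist>"
)


def parse_idlist(xml_text):
    """Return the ISBNs listed in an <idlist> document, in document order."""
    # Parse from bytes so the declared encoding in the prolog is honored.
    data = xml_text.encode("utf-8") if isinstance(xml_text, str) else xml_text
    root = ET.fromstring(data)
    return [el.text.strip() for el in root.iter("isbn") if el.text]


print(parse_idlist(SAMPLE))
```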
superWorks format
Developed for FictionFinder
XML format
Includes expression-level information
  • All the information needed
We are adapting it to the Curioser project
superWork record layout
pKey
# manifestations, holdings, sw-id, control #s
publication dates
expressions
  • expression
  • classes
  • language
  • authors
  • titles
  • subjects
  • components
    • author, title, publication data
Summary
Simpler when only work-level relationships are needed
Even for work-level relationships, a number of different formats are useful
Information needed for an interface gets much more complicated