Pursuing PBCore: The Revitalization of a Schema and Community (AMIA 2014)
PBCORE SURVEY
SURVEY GOALS:
+ Identify areas of need;
+ Develop a broad view of current practice across user-groups;
+ Who’s using it and who isn’t? Why?
+ Understand hurdles (real and perceived) to implementation;
+ So that the Team can:
  + create new resources and training opportunities;
  + supplement existing resources; and
  + guide resources to resolve pain points & encourage use.
139 RESPONDENTS:
+ 31 currently using PBCore
  + Guideline for describing/cataloging
  + Data model for custom database/application
  + Exchange mechanism between applications/organization
+ 72 not currently using PBCore
+ 36 did not indicate use
GOOD NEWS, EVERYONE!
We use PBCore:
+ to create records internally in a standardized way, & share records with users and other organizations.
+ to share PBCore XML with our long term repository.
+ to map incoming descriptive and technical metadata into our media asset management system.
+ to describe assets that have multiple copies in different formats, & items that contain multiple parts.
+ as a framework for developing an internal standard.
+ for its extensibility and granularity.
(CONSTRUCTIVE) CRITICISM:
+ Difficulty navigating website, GitHub.
+ No examples to work from.
+ What does ‘good’ PBCore look like???
+ No best practices / lack of crosswalks
+ Flexibility is intimidating. (“It does too much.”)
+ Vocabularies: “bad / confusing / inconsistent.”
+ Inadequate/unclear definitions
+ Not helpful to lay-people
+ Steep learning curve
POINTS FOR CONSIDERATION:
+ “Not a standard in academic libraries” / “Not appropriate for libraries”
+ Better used for digital assets, not analog
+ MARC and EAD better suited for AV collections
+ Switch from one standard to another is a headache
+ “Intended specifically for broadcasters”
+ “Dublin Core can be manipulated to fit all of our needs”
IT’S LARGELY A MARKETING PROBLEM.
I don’t use PBCore because:
+ I’ve never heard of it.
+ It’s too intense!
+ It’s too complicated!
+ It’s too time-consuming!
SO, WHAT NOW? SUBCOMMITTEE ACTIVITIES IN RESPONSE.
ADDITIONAL SUGGESTIONS?
The Internet Made Me Do It: Confessions of an Accidental Archivist
Jack Brighton
director of new media & innovation
Illinois Public Media
[email protected]
@jackbrighton
WILL AM-FM-TV
Sir Tim Berners-Lee
Marc Andreessen & Eric Bina
The Web changed things a bit
RealAudio anyone?
Windows Media?
QuickTime?
Gone in a Flash
Content Management
“Most of our assumptions have outlived their uselessness.”
Marshall McLuhan
Time for some new assumptions
Where Dublin Core breaks down with audio and video
The One-to-One principle is inefficient and/or messy when dealing with multiple manifestations of a media asset. (See: Steven J. Miller, The One-To-One Principle: Challenges in Current Practice)
Key PBCore Concepts:
✤ Asset: the media item in the abstract
✤ Instantiation: a physical or digital instance of the asset
✤ An Asset can have one or more Instantiations
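The Asset/Instantiation relationship above can be sketched in a few lines of Python. This is only an illustration, not the presenter's code: the element names are borrowed from PBCore, the output is not meant to be schema-valid, and the second (WAV) copy is a hypothetical instantiation invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Instantiation:
    identifier: str    # ID of this particular copy
    media_format: str  # e.g. "mp3"; free text in this sketch


@dataclass
class Asset:
    title: str
    instantiations: list = field(default_factory=list)

    def to_xml(self) -> ET.Element:
        # One description document per asset, one pbcoreInstantiation per copy.
        doc = ET.Element("pbcoreDescriptionDocument")
        ET.SubElement(doc, "pbcoreTitle").text = self.title
        for inst in self.instantiations:
            node = ET.SubElement(doc, "pbcoreInstantiation")
            ET.SubElement(node, "instantiationIdentifier").text = inst.identifier
            ET.SubElement(node, "instantiationDigital").text = inst.media_format
        return doc


asset = Asset("The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy")
# The MP3 access copy named in the slides.
asset.instantiations.append(Instantiation("focus050923a.mp3", "mp3"))
# Hypothetical second copy, added only to show the one-to-many relationship.
asset.instantiations.append(Instantiation("focus050923a-master.wav", "wav"))

print(ET.tostring(asset.to_xml(), encoding="unicode"))
```

The descriptive metadata is written once, at the asset level, and each copy only adds what is specific to that copy.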
WILL Radio interview with John Brady Kiesling, former U.S. diplomat, on “The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy,” Focus 580, September 23, 2005 Interviewer: Jack Brighton Producer: Harriet Williamson
Content
Metadata
A media object:
Published as a web page:
<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd" version="2.0">
  <channel>
    <title>Focus 580 on WILL-AM</title>
    <description>An intelligent interview program on current affairs</description>
    <link>http://www.will.uiuc.edu/am/focus</link>
    <language>en-us</language>
    <copyright>Copyright 2005 University of Illinois</copyright>
    <itunes:image href="http://www.will.uiuc.edu/am/focus/images/focuspodcast.jpg" />
    <lastBuildDate>Fri, 23 Sep 2005 12:10:00 CST</lastBuildDate>
    <pubDate>Fri, 23 Sep 2005 12:10:00 CST</pubDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <webMaster>[email protected]</webMaster>
    <item>
      <title>The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</title>
      <link>http://will.uiuc.edu/am/focus</link>
      <description>Interview with John Brady Kiesling, former U.S. diplomat</description>
      <enclosure url="http://www.will.uiuc.edu/willmp3/focus050923a.mp3" length="24767532" type="audio/mpeg" />
      <category>Current Events</category>
      <pubDate>Fri, 23 Sep 2005 12:10:00 CST</pubDate>
    </item>
  </channel>
</rss>
Same object as an RSS/podcast feed:
RSS Feed viewed in Firefox
RSS Feed viewed in iTunes
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF SYSTEM "http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://will.atlas.uiuc.edu/focus580/interview/focus050923a/">
    <dc:title>The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</dc:title>
    <dc:creator>WILL Public Media - http://will.illinois.edu</dc:creator>
    <dc:subject>WILL; public affairs; public radio; interviews; talk show; NPR; Illinois; Indiana; University of Illinois; Iraq; United States; Foreign Policy</dc:subject>
    <dc:description>Interview with John Brady Kiesling, former U.S. Diplomat</dc:description>
    <dc:publisher>WILL-AM, University of Illinois</dc:publisher>
    <dc:contributor>Jack Brighton, interviewer</dc:contributor>
    <dc:type>Sound</dc:type>
    <dc:language>en</dc:language>
    <dc:relation>http://will.illinois.edu/media/focus050923a.mp3</dc:relation>
    <dc:rights>c 2008 University of Illinois</dc:rights>
  </rdf:Description>
</rdf:RDF>
Same thing as a Dublin Core record:
<?xml version="1.0" encoding="UTF-8"?>
<pbcoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">
  <pbcoreAssetType ref="http://pbcore.org/vocabularies/pbcoreAssetType#program">Program</pbcoreAssetType>
  <pbcoreAssetDate>Fri, 23 Sep 2005 12:00:00 CST</pbcoreAssetDate>
  <pbcoreIdentifier source="Illinois Public Media" ref="http://will.illinois.edu">focus050923a</pbcoreIdentifier>
  <pbcoreTitle titleType="Main">The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy</pbcoreTitle>
  <pbcoreSubject subjectType="topic" source="Illinois Public Media Subjects">Iraq</pbcoreSubject>
  <pbcoreSubject subjectType="topic" source="Library of Congress Subject Headings" ref="http://id.loc.gov/authorities/sh2008123892#concept">Insurgency--Iraq</pbcoreSubject>
  <pbcoreDescription descriptionType="Summary" ref="http://pbcore.org/vocabularies/pbcoreDescription/descriptionType#summary">WILL Radio interview with John Brady Kiesling, former U.S. diplomat, on “The War In Iraq: U.S. Foreign Policy And The Crisis Of International Legitimacy”</pbcoreDescription>
  <pbcoreGenre source="PBCore" ref="http://pbcore.org/vocabularies/pbcoreGenre#interview">Interview</pbcoreGenre>
  <pbcoreCoverage>
    <coverage source="ISO-3166" ref="http://www.geonames.org/countries/IQ/iraq.html">IRQ</coverage>
    <coverageType>Spatial</coverageType>
  </pbcoreCoverage>
  <pbcoreCreator>
    <creator affiliation="Illinois Public Media">Jack Brighton</creator>
    <creatorRole>Interviewer</creatorRole>
  </pbcoreCreator>
  <pbcoreCreator>
    <creator affiliation="University of Illinois" ref="http://will.illinois.edu/am">WILL-AM</creator>
    <creatorRole>Production Unit</creatorRole>
  </pbcoreCreator>
  <pbcoreContributor>
    <contributor ref="http://en.wikipedia.org/wiki/Brady_Kiesling">John Brady Kiesling</contributor>
    <contributorRole source="PBCore" ref="http://pbcore.org/vocabularies/contributorRole#interviewee">Interviewee</contributorRole>
  </pbcoreContributor>
  <pbcorePublisher>
    <publisher ref="http://illinois.edu">University of Illinois</publisher>
    <publisherRole ref="http://pbcore.org/vocabularies/publisherRole#copyright-holder">Copyright Holder</publisherRole>
  </pbcorePublisher>
  <pbcoreRightsSummary>
    <rightsSummary>Permission granted by the copyright holder to stream and download for non-profit and educational use under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0)</rightsSummary>
    <rightsLink>http://creativecommons.org/licenses/by-nc-sa/2.0/</rightsLink>
  </pbcoreRightsSummary>
  <pbcoreInstantiation>
    <instantiationIdentifier source="Illinois Public Media"></instantiationIdentifier>
    <instantiationDate dateType="date created">Fri, 23 Sep 2005 12:00:00 CST</instantiationDate>
    <instantiationDate dateType="date issued">Fri, 23 Sep 2005 12:10:00 CST</instantiationDate>
    <instantiationDigital>mp3</instantiationDigital>
    <instantiationLocation>http://will.illinois.edu/media/focus050923a.mp3</instantiationLocation>
    <instantiationMediaType>Sound</instantiationMediaType>
    <instantiationGenerations>Copy: access</instantiationGenerations>
    <instantiationFileSize unitsOfMeasure="bytes">24767532</instantiationFileSize>
    <instantiationDuration>00:52:08</instantiationDuration>
  </pbcoreInstantiation>
</pbcoreDescriptionDocument>
Same thing with more detail as a PBCore record:
What are we really doing?
• Cataloging media objects
• Creating sharable metadata
Stupid CMS Tricks
Slightly Less Stupid CMS Tricks
We have to create good data
We can export and import the data
Principles:
• Catalog all media objects
• No more data silos
• Systems exchange data with other systems
• They don’t have to be the same systems
• Standards make exchange possible
Sustaining and sharing the data is our only hope
WNYC-NEWS-2012-08-12-78254.wav
BROADCAST ARCHIVE
WEB REPOSITORY
Transcripts, Air Dates, Reporter Name
Contributors names, Abstracts, Tags, Descriptions, Images, Video, MP3 metadata, Comments
Hi-Res Digital Files, File Location, Bit Rate, Duration, File Size, Original File Name, Raw Audio
Analog Elements, Rights Management, Associated Formats, Related Content, Taxonomies, Controlled Vocabulary
<pbcoreIdentifier source="WNYC Archive Catalog">9969</pbcoreIdentifier>
<pbcoreTitle titleType="Series">Don Mathisen</pbcoreTitle>
<pbcoreTitle titleType="Collection">WNYC</pbcoreTitle>
<pbcoreSubject>LA Riots</pbcoreSubject>
<pbcoreGenre source="PBCore Genre Picklist">News</pbcoreGenre>
<pbcoreContributor><contributor>Yarrow, Peter, 1938-</contributor></pbcoreContributor>
<pbcoreRightsSummary><rightsSummary>WNYC</rightsSummary></pbcoreRightsSummary>
<pbcoreInstantiation>
<instantiationIdentifier source="WNYC Media Archive MDB">9969.2</instantiationIdentifier>
<instantiationGenerations>Preservation</instantiationGenerations>
<instantiationDuration>00:02:35</instantiationDuration>
ARCHIVE
<item>
<title>Documenting Apartheid in South Africa</title>
<link>http://www.wnyc.org/story/the-leonard-lopate-show-2014-09-30/</link>
<pubDate>Tue, 30 Sep 2014 00:00:00 -0400</pubDate>
<guid>http://www.wnyc.org/story/the-leonard-lopate-show-2014-09-30/</guid>
<category>apartheid</category>
<category>books</category>
<category>life</category>
<category>middle_east</category>
<category>technology</category>
<source url="http://www.wnyc.org/shows/lopate/">The Leonard Lopate Show</source>
</item>
WEB
<?xml version="1.0" encoding="ISO-8859-1"?>
<ENTRIES>
<ENTRY>
<NUMBER>94012</NUMBER>
<CLASS>News</CLASS>
<TITLE>news20140929 nj police diversity gonzalez</TITLE>
<FILENAME>DA0_65D154848949419DB589751A0B5EEC8D.WAV</FILENAME>
<GENERATOR>DBM</GENERATOR>
<CREATOR>WAYNE</CREATOR>
<DATE>2014-09-30</DATE>
<DATUM>2014-09-30</DATUM>
Repository
SHULMISTER: Listen up, New Yorkers. You want believe that if you make it here you can make it anywhere? How many of you have spent a summer in Birmingham?
AMBI: [Bells up in the clear for a moment]
[Something like: I went to Southside, a neighborhood with lots of restaurants and bars to get some advice for y’all. It’s the only place to find people out on the street, especially in the summer, because most people just stay inside and move from one air conditioned zone to the next.]
MONTAGE ENDS: …. “I’m not buying it.”
Broadcast
Website CMS RSS metadata > PBCore XML
DAM metadata > PBCore XML
PBCore
Find the persistent identifier in each system that connects the metadata to the digital object. Map each metadata set to PBCore XML. Combine all available metadata into a single asset record in the archives database.
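A minimal sketch of those three steps, assuming each system's records are already available as plain dictionaries. The system names, field names, field-to-element mappings and sample values below are hypothetical and only illustrate the idea; they are not WNYC's actual configuration.

```python
from collections import defaultdict

# Hypothetical per-system settings: which field carries the persistent
# identifier, and how the other fields map to PBCore element names.
SYSTEMS = {
    "web": {"id_field": "audio_file",
            "map": {"title": "pbcoreTitle", "category": "pbcoreSubject"}},
    "dam": {"id_field": "FILENAME",
            "map": {"CLASS": "pbcoreGenre", "DATE": "instantiationDate"}},
}


def consolidate(records):
    """Group (system, record) pairs into one PBCore-style dict per object."""
    assets = defaultdict(lambda: defaultdict(list))
    for system, record in records:
        cfg = SYSTEMS[system]
        object_id = record[cfg["id_field"]]        # step 1: persistent identifier
        for name, value in record.items():
            element = cfg["map"].get(name)         # step 2: map to a PBCore element
            if element and value not in assets[object_id][element]:
                assets[object_id][element].append(value)  # step 3: combine
    return {oid: dict(fields) for oid, fields in assets.items()}


# Made-up sample records that share one identifier across two systems.
records = [
    ("web", {"audio_file": "WNYC-NEWS-2012-08-12-78254.wav",
             "title": "Example story", "category": "news"}),
    ("dam", {"FILENAME": "WNYC-NEWS-2012-08-12-78254.wav",
             "CLASS": "News", "DATE": "2012-08-12"}),
]
merged = consolidate(records)
print(merged)
```

Unmapped fields are simply skipped here; a real crosswalk would need a decision about where each of them belongs.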
http://cavafy.wnyc.net/assets?q=2014-05-09&x=0&y=0
We use PBCore to normalize and consolidate metadata generated about a single digital object by disparate and unconnected systems, in order to create a more complete object record for the purpose of digital preservation.
Blueprint is not a title type: Confessions of an ambivalent PBCore user
Mary Lynn Miller, UGA
AMIA conference, October 10, 2014
Background & context
The Peabody Awards Collection
• Given by the Grady College of Journalism
• Radio, TV & Web
• Archive came to libraries in about 1978
• About 1,000 new entries annually
• 70,000 titles?
The Peabody Database
• Intellectual control
• Excel
• MARC records
• Change to PARC, an “Ultimate” database
• Online entries
• Merging old Excel files
Building the catalog
• Goal 1: keeping up
• Goal 2: end backlog
• Goal 3: national standard
• Current status: online Ultimate database somewhere between entry form & MARC
Other Media Archives holdings
The barcode is king
Data: First, do no harm
DC.Creator Chamberlain, Richard
PBCore.Creator Chamberlain, Richard
PBCore.CreatorRole actor
Peabody Richard Chamberlain (Cast (Dr. Kildare))
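The comparison above can be sketched as a small parsing step. This is a rough illustration under several assumptions: the credit pattern is inferred from the single example on the slide, the name inversion is naive (last word treated as family name), and `ROLE_MAP` holds only the one Cast/actor pairing shown here.

```python
import re

# Pattern inferred from one example: "Name (Role (Note))".
CREDIT = re.compile(
    r"^(?P<name>[^()]+?)\s*\((?P<role>[^()]+?)\s*\((?P<note>[^()]*)\)\)$"
)

# Hypothetical role crosswalk; only this pairing appears on the slide.
ROLE_MAP = {"Cast": "actor"}


def split_credit(credit):
    """Split a Peabody-style credit string into PBCore-style parts."""
    match = CREDIT.match(credit.strip())
    if match is None:
        raise ValueError(f"unrecognized credit: {credit!r}")
    given, _, family = match["name"].rpartition(" ")
    inverted = f"{family}, {given}" if given else family
    return {
        "creator": inverted,
        "creatorRole": ROLE_MAP.get(match["role"], match["role"]),
        "note": match["note"],  # the character name has no obvious home
    }


print(split_credit("Richard Chamberlain (Cast (Dr. Kildare))"))
```

Keeping the leftover `note` value rather than discarding it is one way to honor "first, do no harm": nothing from the source string is lost during the mapping.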
Multi-part instantiation

ID: 2011_2011039_ent_1-2
Year: 2011
Entry Number: 2011039
Entry Category: Entertainment (ENT)
Part: 1-2
Total Parts: 2
Title: Appropriate Adult
Instantiation IDs: 2011039ent-1-arch 2011039ent-2-arch

ID: 2011_2011004_ent_1
Year: 2011
Entry Number: 2011004
Entry Category: Entertainment (ENT)
Part: 1
Total Parts: 3
Title: Portlandia. [No. 1, 2011-01-21], Farm
Instantiation IDs: 2011004ent-1-arch

ID: 2011_2011004_ent_2
Year: 2011
Entry Number: 2011004
Entry Category: Entertainment (ENT)
Part: 2
Total Parts: 3
Title: [No. 2, 2011-01-28], A Song for Portland
Instantiation IDs: 2011004ent-2-arch
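The ID scheme in these records looks regular enough to parse. The sketch below assumes the pattern `year_entry_category_part` (with an optional part range such as `1-2`) and an `-arch` suffix on instantiation IDs; both rules are inferred from the three examples above and are not a documented Peabody convention.

```python
import re

# Inferred pattern: <year>_<entry number>_<category>_<part>[-<last part>]
ID_PATTERN = re.compile(r"^(\d{4})_(\d+)_([a-z]+)_(\d+)(?:-(\d+))?$")


def parse_record_id(record_id):
    """Break a record ID into year, entry number, category and part list."""
    match = ID_PATTERN.match(record_id)
    if match is None:
        raise ValueError(f"unrecognized ID: {record_id!r}")
    year, entry, category, first, last = match.groups()
    parts = list(range(int(first), int(last or first) + 1))
    return {"year": int(year), "entry_number": entry,
            "category": category, "parts": parts}


def instantiation_ids(record_id, suffix="arch"):
    """Derive one instantiation ID per part covered by the record."""
    info = parse_record_id(record_id)
    return [f"{info['entry_number']}{info['category']}-{part}-{suffix}"
            for part in info["parts"]]


print(instantiation_ids("2011_2011039_ent_1-2"))
print(instantiation_ids("2011_2011004_ent_2"))
```

A record covering parts 1-2 yields two instantiation IDs, while a single-part record yields one, which matches the three records listed above.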
Vocabularies