RDA & the New World of Metadata
Diane I. Hillmann
AMIGOS Conference, Feb. 20, 2014
"Is RDA on Your RaDAr?"
AMIGOS RDA Conference 2014 2
What Is RDA?
• You’re already familiar with the Toolkit and the ‘rules’
– You’ve heard about the problems with using the RDA instruction with MARC (maybe even tried it)
– RDA as a standard includes both the instruction and the vocabularies
• RDA as data has a very different model than MARC
• MARC was originally designed to print cards
• RDA was built around FRBR
2/20/14
This Transition is Hard
• The transition is not just from AACR2 to RDA
• It’s also about:
– Different views of the world
– Different data models
– Different distribution strategies
• Linked data is part of this transition • Re-thinking our basic assumptions is critical
Model of ‘the World’ /XML
• XML (and MARC21) assume a 'closed' world (domain), usually defined by a schema:
– "We know all of the data describing this resource. The single description must be a valid document according to our schema. The data must be valid."
– XML's document model provides a neat equivalence to a metadata 'record' (and most of us are fairly comfortable with it)
Model of ‘the World’ /RDF
• RDF assumes an 'open' world:
– "There's an infinite amount of unknown data describing this resource yet to be discovered. It will come from an infinite number of providers. There will be an infinite number of equally valid descriptions. Those descriptions must be consistent."
– RDF's statement-oriented data model has no notion of 'record' (rather, statements can be aggregated for a fuller description of a resource)
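The contrast between the two models can be sketched in a few lines of Python. This is only an illustration, not anything from the RDA Toolkit or an RDF library: the required-field list, the `ex:` identifiers, and the function names are all hypothetical, and statements are modeled as plain (subject, property, value) tuples.

```python
from collections import defaultdict

# Hypothetical closed-world schema: a record must carry these fields.
REQUIRED_FIELDS = {"title", "extent"}


def is_valid_record(record):
    """Closed world: a record missing a required field is simply invalid."""
    return REQUIRED_FIELDS <= record.keys()


def aggregate(*sources):
    """Open world: union the statements from any number of providers."""
    merged = set()
    for statements in sources:
        merged |= set(statements)
    return merged


def describe(statements, subject):
    """Collect everything currently known about one resource."""
    description = defaultdict(set)
    for s, p, o in statements:
        if s == subject:
            description[p].add(o)
    return dict(description)


# Two hypothetical providers, each holding a partial description.
library = {("ex:book1", "ex:title", "Moby Dick")}
publisher = {
    ("ex:book1", "ex:title", "Moby Dick"),
    ("ex:book1", "ex:extent", "635 pages"),
}

merged = aggregate(library, publisher)
print(describe(merged, "ex:book1"))
```

The record check rejects an incomplete description outright, while the aggregation step just accumulates whatever statements arrive; the "description" is whatever is known at the moment it is asked for.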
Linked Data is Inherently Chaotic
• Requires creating and aggregating data in a broader context
– There is no one ‘correct’ record to be made from this data, no objective ‘truth’
• This approach is different from the cataloging tradition
– BUT, the focus on vocabularies is familiar
• Linked data relies on the RDF model
The New Data Management
• Managing data at the statement level rather than record level
• Emphasis on evaluating data coming in and recording provenance for data going out
• Shift in human effort from creating standard cataloging records to knowledgeable human intervention in machine-based processes
• Extensive use of data created outside libraries
• Intelligent re-use of our legacy data and redistribution of our data more widely
What’s Still New About RDA?
New version of the vocabularies brings together all we’ve learned since 2008
New ‘branded’ namespace
Commitment to synchronize Toolkit & Vocabs
Optimized for the Semantic Web and linked data
Understanding The New RDA
• What’s [still] new about RDA?
• How should we look at its current progress?
• Is RDA what we need? Should we ‘wait’ for something that might be ‘better’?
• How should we understand the interplay between RDA and technological changes going on concurrently?
Breaking News
• The RDA Vocabularies now have an updated version, and a new namespace
– Old RDA Vocabularies: RDVocab.info
– New RDA Vocabularies: RDARegistry.info
• Old element vocabularies were never published; all new vocabularies are published
• Value vocabularies remain in the older namespace (for now)
Ringing Changes
• Why deprecate?
– The RDVocab elements were never formally published, but they have been used
– Deprecation is better than deletion in this space
– Element sets are reorganized to simplify element names and better integrate relationships with other FRBR elements
• What else is new?
– Verbalized element names (‘has’, ‘is’, etc.)
– Explicit reciprocals
– Different URI strategy
Why GitHub?
• GitHub is a widely used repository with tools that enable services and documentation to be created and managed more easily
– In some cases, human readable and technical versions can be created automatically
– Enables detailed version information to be managed by machines and viewed by users
– Supports easily generated output for use by other systems and users
Constrained & Unconstrained Properties: What’s the Difference?
• The FRBR ‘bounded’ properties should be seen as the official JSC-defined RDA basic Application Profile for libraries
• Extensions and mapping should be built from the unconstrained properties
– Unconstrained vocabularies are necessary for use in domains where FRBR is not assumed or is inappropriate
– Mapping from vocabularies not using the FRBR model directly to ones that do (and back) creates serious problems for the ‘Web of Data’
• These differences make the vocabularies able to express library knowledge in the context of the Semantic Web
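One way to see why the distinction matters is to look at what a declared domain lets software infer. The sketch below assumes, purely for illustration, that the constrained property rdam:extent declares the class rdac:Manifestation as its domain while the unconstrained rdau:extent declares none; the `ex:` identifier and the `infer_types` helper are hypothetical, and the inference shown is the standard RDFS reading of a domain declaration.

```python
# Assumed for illustration: the constrained property carries a domain,
# the unconstrained property (rdau:extent) carries none.
DOMAIN = {"rdam:extent": "rdac:Manifestation"}


def infer_types(statements):
    """Apply the RDFS domain rule: using a property with a declared
    domain implies that the subject belongs to that class."""
    return {
        (s, "rdf:type", DOMAIN[p])
        for s, p, _ in statements
        if p in DOMAIN
    }


constrained = [("ex:item1", "rdam:extent", "635 pages")]
unconstrained = [("ex:item1", "rdau:extent", "635 pages")]

print(infer_types(constrained))    # commits ex:item1 to the FRBR model
print(infer_types(unconstrained))  # says nothing about FRBR
```

Under these assumptions, data published with the unconstrained property can be reused by a community that does not accept FRBR, because nothing about the subject's class is implied.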
What’s Important About the New RDA?
URI strategy optimized for multiple languages and both human and machine use
Sets reorganized to bring the relationships into the element sets
RDA enables data to be managed for both the closed world of library and local data and the open world of linked data
[Image by Reed Sturtevant, FLICKR]
Big Challenges/Big Ideas
• Records are still important but not as we’ve used them in the past
– We might want to think about records as the instantiation of a point of view
– News: traditional library data has a point of view
• MARC required consensus because of limitations built into the technology
– For any data in statements destined for the Semantic Web, we need provenance, so we know “Who sez?”
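A small sketch of the "record as a point of view" idea, assuming each statement carries a fourth element naming who asserted it. The `Statement` type, the `ex:` identifiers, and the `point_of_view` function are hypothetical names chosen for the example, not part of any RDA or RDF tooling.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    prop: str
    value: str
    source: str  # provenance: who asserted this ("Who sez?")


statements = [
    Statement("ex:book1", "ex:title", "Moby Dick", "ex:libraryA"),
    Statement("ex:book1", "ex:title", "Moby-Dick; or, The Whale", "ex:publisherB"),
    Statement("ex:book1", "ex:extent", "635 pages", "ex:publisherB"),
]


def point_of_view(statements, trusted_sources):
    """Build a 'record' as the statements from sources we choose to trust."""
    return [s for s in statements if s.source in trusted_sources]


library_view = point_of_view(statements, {"ex:libraryA"})
combined_view = point_of_view(statements, {"ex:libraryA", "ex:publisherB"})

for s in library_view:
    print(s.prop, s.value, "(asserted by", s.source + ")")
```

Two consumers with different trust decisions end up with different records from the same pool of statements, and neither record is the single ‘correct’ one.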
Open Metadata Registry
• Built-in user interface designed for humans
• Detailed history info on all transactions
• Downloadable options
• Change feed
• Limited notifications
• Limited options for additional documentation
RDA Vocabularies on GitHub
• Easier updating of documentation and technical versions
• More flexible management options (vocabularies can be managed locally using a variety of tools)
• Distributed version control
• Issues management
The OMR and GitHub Will Work in Tandem
Download and view options
Available Technical Formats
Domains and Ranges are Classes
Vocabulary Extension
• The inclusion of unconstrained properties provides a path for extending RDA into specialized library communities and non-library communities, and for better supporting local needs
– Other communities may have a different notion of how FRBR ‘aggregates’ (for example, a colorized version of a film may be viewed as a separate work)
– Non-libraries may not wish to use FRBR at all
– Local users may have additional, domain-specific properties to add that could benefit from a relationship to the RDA properties
Extension using Unconstrained Properties
[Diagram, built up over three slides. Labels: rdau:isAdaptedAs; rdau:isAdaptedAsARadioScript; KidLit:isAdaptedAsAPictureBook; KidLit:isAdaptedAsAChapterBook; three ‘hasSubproperty’ links]
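The practical payoff of hanging local properties off an RDA property can be sketched as follows. The transcript does not preserve the arrows in the diagram, so this example assumes each of the three narrower properties is a direct subproperty of rdau:isAdaptedAs; the `ex:` identifiers and the function names are hypothetical.

```python
# Assumed hierarchy: each narrower property points at its broader property.
SUBPROPERTY_OF = {
    "rdau:isAdaptedAsARadioScript": "rdau:isAdaptedAs",
    "KidLit:isAdaptedAsAPictureBook": "rdau:isAdaptedAs",
    "KidLit:isAdaptedAsAChapterBook": "rdau:isAdaptedAs",
}


def broader_properties(prop):
    """Yield the property itself, then each ancestor in turn."""
    while prop is not None:
        yield prop
        prop = SUBPROPERTY_OF.get(prop)


def generalize(statement, understood):
    """Rewrite a statement to the nearest property the consumer understands,
    or return None if no ancestor is understood."""
    subject, prop, value = statement
    for candidate in broader_properties(prop):
        if candidate in understood:
            return (subject, candidate, value)
    return None


local = ("ex:work1", "KidLit:isAdaptedAsAPictureBook", "ex:work2")

# A consumer that only knows the RDA property can still use the local data.
print(generalize(local, {"rdau:isAdaptedAs"}))
```

The specialized community keeps its finer-grained statement, and a consumer that has never heard of the KidLit vocabulary still gets a usable, if less specific, one.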
What’s our Distribution Model?
We don’t know what you want, so choose!
We know more about what you want than you do. Here it is!
Libraries as Data Publishers & Consumers
• Data from library ‘publishers’ should look like a supermarket: lots of choices, with decisions made by consumers
– Right now we seem to be operating as Soviet bakeries
– This is not what open linked data is supposed to be doing for us
• "Be conservative in what you send, liberal in what you accept" (Robustness Principle)
Where You Start Affects Where You End Up
• Simple metadata is more useful as output than input
– The ‘long tail’ of MARC’s lesser used properties was built up over decades and shouldn’t be discarded
– It’s easier to dumb down than smarten up (and not as lossy)
• Dublin Core and MARC are examples of starting simple and trying to add on
– MARC 21 went well beyond AACR2 in its scope
– Dublin Core successful as a common mapping (most schemas map to DC) but rarely sufficient by itself
Libraries as Data Publishers
• If we want people outside libraries to use our data, we need to offer them choices
• This strategy is supported by lossless mapping of all of our legacy data
– Not a pre-determined selection
– Filtering best accomplished by data consumers, who know what they need
• This requires a new strategy for managing the data
– RDA, as a rich metadata model based on sound library experience, is an excellent basis for this strategy
Libraries as Data Consumers
• As aggregators of relevant metadata content
– Developing methods to gather and redistribute data without necessarily re-creating OCLC
– Modeling and documenting best practices in metadata creation, improvement and exposure
– Application profiles important in this effort
• As developers of vocabularies, exposing a variety of bibliographic relationships
• As innovators (particularly as part of the cultural metadata community) using social networks to enhance bibliographic description
Mapping Legacy Data for Re-distribution
• If we want data consumers to value our data, we should map it all
– We can distribute limited ‘flavors’ as well, as we gain experience and feedback
• Current crosswalking strategies are based on:
– One-time, inflexible, programmatic methods that effectively hide the process from consumers
– Assumptions that data must be improved at the time it is crosswalked, or never
What We Mean by ‘Mapping’
[Diagram relating extent, format, and duration properties across vocabularies. Labels: bibo:numPages, bibo:numVolumes, rdam:extent, isbd:"has extent", dct:extent, rdam:extentOfText, dct:format, rdau:extent, rdau:extentOfText, dc:format, rdau:duration, rdae:duration, m21:M306__a, m21:M300, unim:U127__a, unim:U215__a]
Will This Shift Cost Too Much?
• We need to support efforts to invest in more distributed innovation and focused collaboration
• It’s the human effort that costs us
– Cost of traditional cataloging is far too high, for increasingly dubious value
• Our current investments have reached the end of their usefulness
– All the possible efficiencies for traditional cataloging have already been accomplished
• Waiting for leadership from the big players costs valuable time with no guarantees of results
The Bottom Line
• Our big investment is (and has always been) in our data, not our systems
• Over many changes in format of materials, we’ve always struggled to keep our focus on the data content that endures, regardless of presentation format
• We are in a great position to have influence on how the future develops, but we can’t be afraid to change, or afraid to fail
Thank you! Questions?
Contact info: [email protected]
Metadata Matters: http://managemetadata.com/blog