BIBFRAME: Why? What? Who?
BIBFRAME is short for Bibliographic Framework. It began as a Library of Congress (LC) initiative in 2011 to transition
from a legacy, MARC-based environment to one that fully integrates with and reaps the benefits of the
World Wide Web. BIBFRAME is the foundation for the future of bibliographic description[1]; it will
become the primary means of bibliographic data[2] exchange; and it will replace the MARC Format.
BIBFRAME’s primary benefit to the community of knowledge seekers is its ability to enhance
information exploration through the use of links and World Wide Web technologies, creating a virtual
“stack browsing” experience while improving on physical browsing.
By integrating bibliographic data into the linked and networked environment of the World Wide
Web, BIBFRAME will enhance information discovery and promote knowledge navigation. It will reduce
the costs associated with traditional cataloging because it will lessen the time associated with
maintaining authority data.
BIBFRAME defined
BIBFRAME relies on relationships between resources, not on bibliographic description alone. In
the BIBFRAME environment, we will not refer to bibliographic “records” in the traditional sense of the
word. BIBFRAME relies on controlled identifiers to identify entities, not on controlled strings of data. A
controlled identifier is a number or code that uniquely identifies an entity, for example, “n 84125431”
identifies “United States. Congress. Senate. Committee on Armed Services. Task Force on Selected
Defense Procurement Matters.” The text in quotes is a controlled string of data. Such controlled strings
of data are the hallmark of a MARC record.
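The distinction between a controlled string and a controlled identifier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary standing in for an authority file, the field names `contributor` and `contributor_id`, the function `display_contributor`, and the title "Hypothetical report" are all invented here. Only the identifier and the heading come from the example above.

```python
# The controlled string from the example above.
LABEL = (
    "United States. Congress. Senate. Committee on Armed Services. "
    "Task Force on Selected Defense Procurement Matters."
)

# Hypothetical stand-in for an authority file: identifier -> preferred label.
authority = {"n 84125431": LABEL}

# MARC-style description: the controlled string is copied into the description.
string_based = {"title": "Hypothetical report", "contributor": LABEL}

# Identifier-style description: only the controlled identifier is stored.
identifier_based = {"title": "Hypothetical report", "contributor_id": "n 84125431"}


def display_contributor(description, authority):
    """Return the contributor label to show a patron.

    Identifier-based descriptions are resolved through the authority data;
    string-based descriptions simply return the embedded text.
    """
    if "contributor_id" in description:
        return authority[description["contributor_id"]]
    return description["contributor"]


print(display_contributor(identifier_based, authority))
```

Both descriptions display the same heading, but the identifier-based one holds no copy of the text that could fall out of date.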
[1] Bibliographic description: the process of describing a resource held by a library through transcription of elements such as title, edition, size, etc., that assist in identifying the resource.
[2] Bibliographic data: bibliographic description in a machine-readable format.
BIBFRAME: Why is it important?
BIBFRAME opens the world community to the wealth of authoritative bibliographic data, which
is essential to the access of knowledge and which has been so carefully curated and managed by
libraries for generations. Library bibliographic data is built upon a solid infrastructure of authoritative
names and subjects. It is reliable, consistent, and “clean,” thanks to its use of regulated standards. But
it is encased in a data format that is not easily understood or easily deployed by non-library
professionals. BIBFRAME seeks to lower the access barrier, partly by adopting contemporary data
practices but more by fostering an environment that is not just on the World Wide Web but part of the
World Wide Web. With BIBFRAME, the library community has an opportunity to make its controlled and
well-crafted bibliographic data accessible to a global audience. Wider accessibility of a library’s
bibliographic data makes the library’s resources and holdings known and available to “outsiders.” If one
of those outsiders, for example, is Google, then exposing library bibliographic data in this way can
translate into more relevant search results for users, and more patrons visiting library collections.
BIBFRAME is very much a modernization effort.
And, as with most modernization efforts, this effort requires a new way of thinking and doing
things. By integrating our bibliographic data into the World Wide Web, sharing it in non-traditional,
non-MARC ways, and embracing the use of links to connect resources, libraries will create an
atmosphere of knowledge exploration that cannot be achieved using the MARC Format to expose our
bibliographic data.
The MARC Format is the machine-readable standard currently used by the library community.
Created in the mid-1960s, it has been in continuous use for nearly fifty years and touches every aspect
of library technology ranging from traditional nonprofit library catalogs to commercial library vendors.
The library community uses the standard to record and share bibliographic data, and the MARC record is
the “package” that contains and communicates the data. A MARC record is an aggregation of
information about a described resource and its physical carrier. See Appendix A for an illustration of a
typical MARC record.
The fact that the MARC Format was created for a defined set of users—the library community—
cannot be ignored. Although the MARC Format has served the library community admirably for nearly
half a century, technological advancements in the way all data can be created and shared have eclipsed
the once revolutionary ability of this format to share its bibliographic data and has left the library
community isolated. Information retrieval systems such as Google cannot harvest bibliographic data
encoded in MARC and make it accessible in multiple ways because of the limitations of the format.
When an information retrieval system cannot interpret MARC bibliographic data, it might present it in
its raw form, not coupled with anything to increase its value to the patron, or simply fail to interpret the
data. BIBFRAME will present bibliographic data in such a way that information retrieval systems can
make semantic sense of it, so that the bibliographic data can be presented to a patron in an enhanced
and linked manner, whether the information retrieval system is owned by Google or owned by the
Library of Congress.
BIBFRAME: What will the final results be?
Rather than collocating data into a record, the BIBFRAME data will be decentralized with links to
data replacing the MARC strings of data. The same resource described in the static, two-dimensional
MARC record in Appendix A becomes a springboard for knowledge exploration when visualized through
the BIBFRAME model. Appendix B shows a visualization of BIBFRAME’s powerful use of links to
illuminate relationships among resources. The BIBFRAME visualization shares some of the attractions of
browsing open stacks in a library—where the patron is in search of a particular resource, but in the
process, is led to other related and relevant resources on the shelf. BIBFRAME will promote more
systematic and efficient retrieval and exploration of library resources than is available when physically
browsing the stacks or consulting a MARC-based catalog.
BIBFRAME: Who benefits?
First and foremost, the library community benefits from BIBFRAME as this system presents a new
data model for libraries. It provides a contemporary way for libraries to realize cost savings in creating
bibliographic data to share and exchange among their peers. BIBFRAME relies on links to avoid the
duplicative efforts of manually creating multiple individual records for the same resource. By creating a
single resource description in BIBFRAME for a work, and then linking that description to all versions of
the work, and to other related resources, such as translations, movies, dramatizations, music, etc.,
libraries will be able to describe more resources more quickly and more efficiently. See Appendix B
for an example. By relying on links to identify relationships, BIBFRAME will endow bibliographic data
with entirely new dimensions that will be a benefit to both libraries and information seekers. Libraries
will be better able to reveal the depth and breadth of their holdings, illuminating resources and
connections that are vital to the community of knowledge seekers. BIBFRAME can help a local public
library publish its holdings for a particular resource in such a way that if a reader searches for that
resource in a Google search, the search engine could highlight the library’s holdings. See Appendix C for
an example.
The library community will also benefit from BIBFRAME’s ability to make bibliographic work
relevant in the twenty-first century, with data communication possibilities beyond the scope of the
MARC Format. BIBFRAME employs resources beyond the library community. It enables librarians to
embrace a wider range of data-sharing formats and technologies, with a reciprocal increase in choices of
methodologies to employ in sharing library data. Authority work has historically been one of the most
costly aspects of bibliographic description. BIBFRAME’s use of controlled identifiers over the MARC
Format’s reliance on controlled text strings for entity description will lessen considerably the time and
costs associated with maintenance of authority data. One controlled string of data may appear in
thousands of MARC records. If that controlled string of data changes, a fairly common occurrence, all
MARC records containing that controlled string of data need to be changed as well. Maintenance of
MARC records can be costly and time-consuming. A controlled identifier does not change, even if the
controlled string of data associated with that identifier changes; BIBFRAME thus reduces the time and
the costs of bibliographic maintenance. In a similar way, BIBFRAME will integrate into the current model
of cooperative bibliographic data and will provide libraries worldwide with the means to increase the
visibility of their collections. BIBFRAME’s use of controlled identifiers, which are language-neutral, over
MARC controlled text strings, which are language-dependent, facilitates wider international sharing of
bibliographic description, with the same beneficial return on investment that results from the use of
controlled identifiers over controlled text strings.
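The maintenance argument above can be made concrete with a small sketch. Everything in it is hypothetical and chosen only for illustration: the identifier `id-001`, the labels, the count of 1,000 records, and the function names. It compares the number of edits a heading change requires under each approach, and shows one identifier carrying labels in more than one language.

```python
def rename_in_string_records(records, old, new):
    """Rewrite every record that embeds the old controlled string.

    Returns the number of records that had to be edited.
    """
    edits = 0
    for record in records:
        if record["contributor"] == old:
            record["contributor"] = new
            edits += 1
    return edits


def rename_in_authority(authority, identifier, new, lang="en"):
    """Change the label stored once under a controlled identifier.

    Returns the number of edits, which is always one.
    """
    authority[identifier][lang] = new
    return 1


def label_for(record, authority, lang):
    """Resolve a record's contributor identifier to a label in one language."""
    return authority[record["contributor_id"]][lang]


# Hypothetical data: 1,000 descriptions naming the same entity.
string_records = [
    {"title": f"Resource {i}", "contributor": "Old Name"} for i in range(1000)
]
linked_records = [
    {"title": f"Resource {i}", "contributor_id": "id-001"} for i in range(1000)
]
# One identifier, with hypothetical labels in two languages.
authority = {"id-001": {"en": "Old Name", "fr": "Ancien nom"}}

string_edits = rename_in_string_records(string_records, "Old Name", "New Name")
linked_edits = rename_in_authority(authority, "id-001", "New Name")

print(string_edits, linked_edits)
print(label_for(linked_records[0], authority, "en"))
print(label_for(linked_records[0], authority, "fr"))
```

In this sketch the string-based records need one edit each, while the identifier-based records need none: the single change to the authority data is picked up wherever the identifier is resolved.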
There are many unrevealed library resources backed by authoritative and bibliographic data that
deserve to be brought to the attention of the community of knowledge seekers. BIBFRAME is not just on
the World Wide Web; it is a part of the World Wide Web. Through its use of links and World Wide
Web technologies, BIBFRAME enables the rich and authoritative bibliographic data embedded in MARC
records and in library catalogs to be harvested by Web-based search engines and made more accessible
to the community of knowledge seekers. With this wider exposure of bibliographic data, the community
of knowledge seekers will not need to connect directly with a particular library; the library’s data will be
brought directly to the community. The virtual “stack browsing” experience that results from this wider
exposure can lead to unimaginable discoveries. When the well-crafted and authoritative data that
librarians have been creating for years is joined with the technology of the World Wide Web and existing
linking models, the possibilities for data sharing and knowledge dispersion are greatly expanded.
Paul Frank
Cooperative and Instructional Programs Division
May 1, 2014
I would like to thank two LC staff members who assisted me in writing this paper.
Judith P. Cannan, Chief, Cooperative and Instructional Programs Division, provided the focus and a degree of organizational integrity that was absent in the paper’s first drafts. She spent a considerable amount of time with me in review and editing, all at a very busy time in her professional and personal life. I am extremely grateful to her for her time and her contributions to the final paper.
Kevin Ford, Network Development and MARC Standards Office, reviewed the paper and enthusiastically provided me with voluminous additional original information and suggestions that are incorporated in part here. He has a wealth of knowledge of BIBFRAME and I hope that the information I could not include here might be made public in other venues in the future. I am also grateful to Mr. Ford for his article “LC’s Bibliographic Framework Initiative and the Attractiveness of Linked Data,” in Information Standards Quarterly, Spring/Summer 2012, vol. 24, issue 2/3, ISSN 1041-0031, pp. 46-50. I have used parts of Mr. Ford’s article in preparing this paper.
Appendix A
A MARC bibliographic record for the resource Tolstoy’s War and Peace, with MARC encoding elements highlighted. One resource searched, one resource identified:
Appendix B
A visualization of BIBFRAME’s powerful use of links to illuminate relationships among resources, using Tolstoy’s War and Peace:
A search for the novel War and Peace will reveal not only all editions of the novel, but also reveal the related translations, films, television programs, musical works, art work, etc., and even related resources that have similar subject content.
Note the Instance links in the visualization. In MARC, two editions of War and Peace require two records, each of which must duplicate the title, author, subjects, and other information. With BIBFRAME, one record is created representing War and Peace, thereby recording the title, author, and subjects only once. Two smaller, non-duplicative records would be created, one for each of the two editions, and then linked to the main work War and Peace. The shared information will have been entered only once. In this way, using BIBFRAME, catalogers would in principle be able to describe more resources both more efficiently and more quickly, because we are capitalizing on links and not relying on duplicative effort.
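The work-and-edition arrangement described here can be sketched as a simple data structure. This is a rough illustration in plain Python, not the BIBFRAME vocabulary itself. The class and field names (`Work`, `Instance`, `instance_of`), the helper `editions_of`, the edition labels, and the subject value are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    """Information shared by every edition, entered once."""
    title: str
    author: str
    subjects: tuple


@dataclass(frozen=True)
class Instance:
    """A small edition-level description that links to its Work."""
    edition: str
    instance_of: Work


# The shared description is created a single time.
war_and_peace = Work(
    title="War and Peace",
    author="Tolstoy",
    subjects=("hypothetical subject heading",),
)

# Two editions, each carrying only edition-level data plus a link.
instances = [
    Instance(edition="Edition A (hypothetical)", instance_of=war_and_peace),
    Instance(edition="Edition B (hypothetical)", instance_of=war_and_peace),
]


def editions_of(work, instances):
    """Follow the links back from instances to list a work's editions."""
    return [i.edition for i in instances if i.instance_of is work]


print(editions_of(war_and_peace, instances))
for instance in instances:
    # Title and author are read through the link, not copied.
    print(instance.edition, "->", instance.instance_of.title)
```

Because each `Instance` holds a link rather than a copy, the title, author, and subjects exist in one place, which is the saving in cataloging effort that the paragraph above describes.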
Appendix C
BIBFRAME can help a local public library publish its holdings for a particular resource in such a way that if someone searches for that resource in a Google search, the search engine could highlight the library’s holdings.
Here is a search result that you might see today:
But in the BIBFRAME environment, a different potential search result: