ADVANCES IN LIBRARY DISCOVERY SERVICES
The State of the Art in 2011
Marshall Breeding
Director for Innovative Technology and Research
Vanderbilt University Library
Founder and Publisher, Library Technology Guides
http://www.librarytechnology.org/
http://twitter.com/mbreeding
Internet Librarian 2011
May 20, 2011
Abstract
Marshall Breeding will provide a look into the next generation of library catalogs. The initial phase of next-generation catalogs extended beyond the capability of the ILS online catalog module with relevancy-based search, faceted navigation, and extended scope. The current wave of discovery systems extends search to Web-scale capacity, addressing library subscriptions of scholarly content at the article level in addition to local physical and digital collections.
Evolution of library collection discovery tools
Bound handwritten catalogs
Card Catalogs
Library online catalogs – OPACs
Next-Gen Catalogs / Discovery interfaces
Social Discovery
Web-scale discovery services
Comprehensive presentation layer services
Bound Catalog
National Library of Colombia
Card Catalog
National Library of Argentina
Card Catalog
University of Kansas Library
Online Card Catalog
Salem International University
Computerized card catalog
Web-based online catalog
AquaBrowser
Summon
The ever-expanding data model
Online Catalog
[Diagram: Search, Search Results, ILS Data]
Discovery Interface
[Diagram: Search, Search Results, Local Index, Meta Search Engine, real-time query and responses; sources shown: ILS Data, Digital Collections, ProQuest, EBSCOhost, …, MLA Bibliography, ABC-CLIO]
Web-scale Discovery
[Diagram: Search, Search Results, Consolidated Index, pre-built harvesting and indexing; sources shown: ILS Data, Digital Collections, ProQuest, EBSCOhost, …, MLA Bibliography, HathiTrust]
Legacy ILS Model / Extended Discovery
[Diagram: LMS, API Layer, Discovery Service (Search, Search Engine, Consolidated index); sources shown: Digital Collections, ProQuest, EBSCOhost, …, JSTOR, Other Resources]
Web-scale Search + Federated Search
[Diagram: Search, Search Results, Consolidated Index, pre-built harvesting and indexing, FedSearch, Non-harvestable Resources; sources shown: ILS Data, Digital Collections, ProQuest, …, MLA Bibliography, ABC-CLIO]
Interim model to deal with resources not possible to harvest into consolidated index
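The interim model above can be illustrated with a minimal sketch. It is not taken from any product named in this deck. It assumes a tiny in-memory list standing in for the consolidated index and a dictionary of hypothetical search functions standing in for resources that cannot be harvested. All names (search_index, search_federated, blended_search) are invented for illustration.

```python
def search_index(index, query):
    """Search the pre-built index: return records whose title has every query term."""
    terms = query.lower().split()
    return [rec for rec in index if all(t in rec["title"].lower() for t in terms)]


def search_federated(targets, query):
    """Query each non-harvestable target in real time.

    `targets` maps a source name to a callable returning a list of titles.
    A target that fails is skipped so one slow or broken source does not
    block the whole result set (an assumed policy, not a documented one).
    """
    hits = []
    for name, search_fn in targets.items():
        try:
            titles = search_fn(query)
        except Exception:
            continue
        hits.extend({"title": title, "source": name} for title in titles)
    return hits


def blended_search(index, targets, query):
    """Return indexed and federated results side by side, without merging ranks."""
    return {
        "indexed": search_index(index, query),
        "federated": search_federated(targets, query),
    }
```

Keeping the two result sets separate avoids inventing a common relevancy score for sources that are queried in different ways.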
Encore Synergy
[Diagram: Search, Local Index, Web Services, Local Index Results, Remote Search Results; sources shown: ILS Data, Digital Collections, ProQuest, …, EBSCOhost, MLA Bibliography, ABC-CLIO]
Encore Synergy
Social Discovery
[Diagram: Search, Search Results, Local Index; sources shown: ILS Data, Digital Collections, Web site data, …, User Contributed Content]
Unified Search Model
[Diagram: Search, Search Results, Discovery Index; sources shown: ILS Data, Digital Collections, Web site data, …, User Contributed Content, Consolidated Indexes of Articles]
Library Web Presence
[Diagram: Integrated Library System; Library Web site; Subject Guides; Article, Databases, E-Book collections; Public Interfaces: Presentation Layer; Discovery Service (Search, Search Engine, Consolidated index); sources shown: Digital Coll, ProQuest, EBSCO…, JSTOR, Other Resources]
New Library Management Model
[Diagram: Library Management System, API Layer; connected systems: Learning Management, Enterprise Resource Planning, Stock Management, Self-Check / Automated Return, Authentication Service, Smart Card / Payment systems]
Discovery from Local to Web-scale
Initial products focused on technology
  AquaBrowser, Endeca, Primo, Encore, VuFind
  Mostly locally-installed software
Current phase focused on pre-populated indexes that aim to deliver Web-scale discovery
  Summon (Serials Solutions)
  WorldCat Local (OCLC)
  EBSCO Discovery Service (EBSCO)
  Primo Central
  Encore with Article Integration
Social Discovery
Builds on modernized library catalog interfaces
Strong emphasis on Web 2.0 concepts
Users invited to contribute reviews, ratings, preferences, reading lists, etc.
User-supplied data becomes part of the discovery process
Users help each other to find interesting library materials
Example: Leverage use data for a recommendation service of scholarly content based on link resolver data: Ex Libris bX service
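As a rough illustration of how use data can drive recommendations, the sketch below counts which articles are requested together in the same session and suggests the most frequent companions. It is a generic co-occurrence approach under assumed inputs (lists of article identifiers per session) and does not describe how the bX service itself works. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_usage(sessions):
    """Count how often each pair of articles appears in the same session.

    `sessions` is an iterable of lists of article identifiers, for example
    as they might be extracted from link resolver logs (an assumption).
    """
    co_usage = defaultdict(Counter)
    for session in sessions:
        # Deduplicate so repeated clicks in one session count once.
        for a, b in combinations(sorted(set(session)), 2):
            co_usage[a][b] += 1
            co_usage[b][a] += 1
    return co_usage


def recommend(co_usage, article, limit=3):
    """Return up to `limit` articles most often used alongside `article`."""
    companions = co_usage.get(article, Counter())
    return [other for other, _count in companions.most_common(limit)]
```

For example, with sessions [["a", "b"], ["a", "b", "c"], ["a", "d"]], article "b" is the strongest companion of "a" because the two appear together twice.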
Differentiation in Discovery
Products increasingly specialized between public and academic libraries
Public libraries: emphasis on engagement with physical collection
Academic libraries: concern for discovery of heterogeneous material types, especially books + articles + digital objects
Developments in Discovery 2011
Continued emphasis on Index-based search
Serials Solutions: Summon
Ex Libris: Primo Central
OCLC: WorldCat Local
EBSCO: EBSCO Discovery Service
[Innovative: Encore Synergy]
Adoption trends
Great interest by academic libraries in Summon, EDS, Primo Central, WorldCat Local
Public Libraries: BiblioCommons adopted by major municipal libraries and consortia
Vendor-specific discovery: LS2 PAC, Enterprise, Encore, Axiell Arena, Infor Iguana
AquaBrowser currently losing ground
  New SaaS version from Serials Solutions
Association of Research Libraries
www.librarytechnology.org/arl-discovery.pl
Pre-populated discovery indexes
New-generation interface
Harvested local content
  ILS metadata
  Institutional repositories, ETDs, Digital Collection platforms
Vendor-supplied indexes of library content
  E-journals, databases, e-books
  Full-text and metadata corresponding to e-content subscriptions
Book collections beyond local library collections
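The idea of harvesting several sources into one pre-populated index can be sketched as follows. This is a toy inverted index over titles only, assuming each harvested record is a dictionary with "id" and "title" keys. Production services use a search engine rather than code like this, and the function names are hypothetical.

```python
from collections import defaultdict


def build_consolidated_index(sources):
    """Merge harvested records from every source into one searchable index.

    `sources` maps a source name (e.g. "ils", "repository") to a list of
    records. Returns the record store and the term-to-record postings.
    """
    records = {}
    postings = defaultdict(set)
    for source_name, harvested in sources.items():
        for rec in harvested:
            # Prefix the id with the source so ids cannot collide.
            key = f"{source_name}:{rec['id']}"
            records[key] = {**rec, "source": source_name}
            for term in rec["title"].lower().split():
                postings[term].add(key)
    return records, postings


def search(records, postings, query):
    """Return records whose titles contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    keys = set.intersection(*(postings.get(t, set()) for t in terms))
    return [records[k] for k in sorted(keys)]
```

Because all sources are indexed ahead of time, a single lookup answers the query and no remote system is contacted at search time.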
The Battle of the Mega Index
Working toward comprehensive representation of potential library content: ~1 billion items
Well within the thresholds of the capacity of modern search engine technologies
Apache SOLR used by most
Building the Index: Business strategies
Deals with publishers and providers to expose metadata and full-text for discovery
Interesting relationship among discovery service providers
  Publishing business: Serials Solutions (ProQuest), EBSCO
  Technology business: Ex Libris, OCLC (?)
Serials Solutions: ProQuest content + growing array of third party content
EDS: EBSCOhost content + growing array of third party content
OCLC & Ex Libris: Indexes built entirely out of third party content
The Challenge for Open Source
Open source discovery interfaces:
  VuFind (Villanova University)
  Blacklight (University of Virginia)
No open content mega index
Discovery has shifted from primarily a technology product to a content-driven product
Discovery Services and Publishers
Discovery services based on a central index depend on publishers and other content providers to cooperate in providing access to metadata or full text data
Not a publishing model – Users access content through publisher site
What’s in the Index?
Important to understand which components of a library's collection are represented in its discovery service and which are not
Point of differentiation in selecting a discovery service
Point of differentiation in selecting content
Open Discovery Initiative
Project underway to address issues related to information providers, discovery service providers, and libraries
Protocols for transfer of content
Transparency of what is transferred and indexed
Rights or restrictions on how discovery services use content
Initial meeting at ALA Annual
Proposal under consideration by NISO
“Proposed New Work Item: Standards and Best Practices for Library Discovery Services Based on Indexed Search”
Summon: Unilateral transparency
Citations / Metadata > Full Text
Citations or structured metadata provide key data to power search & retrieval and faceted navigation
Indexing full-text of content amplifies access
Important to understand depth of indexing
  Currency, dates covered, full-text or citation
  Many other factors
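To show why structured metadata matters for faceted navigation, here is a minimal sketch that counts facet values across a result set and narrows the set when a user picks one. The record fields ("format", "year") are assumed for illustration, and the function names are hypothetical.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each value of a metadata field."""
    counts = Counter()
    for rec in records:
        value = rec.get(field)
        # A field may hold a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        counts.update(v for v in values if v is not None)
    return counts


def apply_facet(records, field, value):
    """Keep only the records whose field matches the selected facet value."""
    kept = []
    for rec in records:
        current = rec.get(field)
        values = current if isinstance(current, list) else [current]
        if value in values:
            kept.append(rec)
    return kept
```

Records that lack the field simply do not appear under that facet, which is one reason the completeness of citation metadata affects what users can find.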
Discovery w/Full-text Book content
HathiTrust
HathiTrust:
HathiTrust will expose SOLR index to discovery providers (Summon, Primo Central, WorldCat Local, EDS)
Introduces full-text book search into discovery services
A total of 8.4 million volumes
4.6 million books
200,000 serial titles
3 billion pages of text
Challenge for Relevancy
Technically feasible to index hundreds of millions or billions of records through Lucene or SOLR
Difficult to order records in ways that make sense
Many fairly equivalent candidates returned for any given query
Must rely on use-based and social factors to improve relevancy rankings
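One way to picture a use-based boost is to scale the text-match score by a dampened usage count, so that heavily used items rise among otherwise equivalent candidates. The formula, the weight, and the field names below are assumptions chosen for illustration, not the ranking method of any discovery service.

```python
import math


def blended_score(text_score, use_count, weight=0.3):
    """Boost a text relevancy score by the logarithm of a usage count.

    log1p dampens the effect so a very popular item cannot swamp a much
    better text match. With use_count == 0 the score is unchanged.
    """
    return text_score * (1.0 + weight * math.log1p(use_count))


def rank(candidates, weight=0.3):
    """Order candidates by blended score, highest first.

    Each candidate is a dict with "text_score" and "use_count" keys.
    """
    return sorted(
        candidates,
        key=lambda c: blended_score(c["text_score"], c["use_count"], weight),
        reverse=True,
    )
```

With two candidates that have the same text score, the one with more recorded uses is listed first, which is the tie-breaking behaviour the slide argues for.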
From Discovery to Management
Serials Solutions: Summon > Web-scale management Solution
OCLC: WorldCat Local > Web-scale management Solution
Ex Libris: Primo > Alma
Re-coupled Discovery?
Decoupled interfaces emerged from broken online catalogs
  Poor interfaces, inadequate scope
Inefficient integration between automation and discovery platforms
New wave of more tightly integrated suites:
  Alma > Primo
  Web-scale Management Services > WorldCat Local
  Serials Solutions Web-scale Management Solution > Summon
Still possible to decouple, but more effort, worse results
Integration with e-book lending services
Current environment reflects weak integration:
  Library catalog populated with MARC records representing e-book collection
  Library users linked into e-book vendor site
  Uses ILS patron authentication for patron validation and authorization
Need to move to deeper integration with more seamless user experience
Device Agnostic
Next-Gen Library Catalogs
Marshall Breeding
Neal-Schuman Publishers
March 2010
Volume 1 of The Tech Set