A Decade of Discovery: What We Know and Where We Will Go

Post on 30-Jun-2015

2014 Charleston Conference Friday, Nov 7, 11:45 AM

A Decade of Discovery: What We Know and Where We Will Go

Lisa Hinchliffe, William H. Mischo, Michael A. Norman

November 8, 2014

Collections Context: a distributed, heterogeneous information environment

• Annual collection budget of $15+ million
• 200+ specialized article and report databases
• 65K online journals
• Online is, in many cases, the exclusive access point
• Have a legacy online catalog (common search over metadata, including holdings of books, journals, media, microforms, etc.)

Assessment

In the book “How to Measure Anything,” Douglas Hubbard encourages working from four assessment assumptions:

1) your problem is not as unique as you think,
2) you have more data than you think,
3) you need less data than you think, and
4) an adequate amount of new data is more accessible than you think.

Evidence-based framework

Taking these as our starting points, the Library’s Discovery and Delivery Study Team at the University of Illinois at Urbana-Champaign has formulated an evidence-based framework for decision-making relative to future system development.

Assessing Reports Over the Past Decade:

• Library’s reports on implementing the WebFeat federated search system (2005-2006),
• developing and deploying Easy Search, a locally-developed and supported broadcast search, search assistance, and recommender system (2006-present), and
• piloting Primo (2011-2014)

For this presentation, we will particularly concentrate on:

Analysis of user search behavior, search types, and use of assistance features through Easy Search logs (2010-present) with particular examination of Easy Search to Primo transactions and database coverage of user searches.

A Review of Library Reports - We Value:

• Transparency
• Predictability
• Customizability
• Co-Development Opportunities

A Review of a Decade of User Surveys - Users Want:

• Seamless Digital Delivery
• Coherent Discovery Pathways
• As Simple as Possible but Not Simplistic
• Not Everything, But “My Everything”
• Transparency
• Independence

University of Illinois Library Gateway

Gateway portal introduced in September 2007, powered by Easy Search federated search system:

• Recommender and discovery system
• Employs Search Assistance mechanisms
• Helps with search strategy modification and navigation
• Takes users into native interfaces at point of completed search
• Writes out custom transaction logs

UIUC Library Gateway

Current Easy Search Results Page

BETA: Bento Easy Search Top of Screen Results

BETA: Bento Easy Search Bottom of Screen Results:

Exploring Discovery via Primo (2011-2014)

• Role of the web-scale system in the Library’s main gateway
• Relationship between a web-scale system and Easy Search
• Effectiveness of full-text search and relevancy ranking algorithms
• Relationship between a web-scale aggregated central index and the specialty A & I Services
• Effectiveness of vendor databases (e.g., EBSCO and Scopus) when integrated into Primo
• Value of blended displays
• Instructional issues connected with web-scale systems (Note – See LOEX paper by Hinchliffe and Avery)

Previous Studies Have Included Custom Transaction Log Data Analysis

• Conducted studies of user behavior in the Easy Search University of Illinois Gateway

• Looked at a total of 1.4 million searches and 1.5 million clickthroughs in the Fall 2010 and Spring 2011 semesters

• Looked at 1 million searches over a 10-month period, May 2013 to March 2014

• Supported by two grants (NSF and IMLS), but the work is a continual effort

Transaction Logs

• We have used transaction logs in the development of context-specific and adaptive search assistance for users

• Transaction log analysis is complemented with user interviews and surveys

• Relational database of two tables with analysis using SQL queries
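
As a rough illustration of a two-table setup analyzed with SQL, the sketch below uses Python's built-in sqlite3 module. The table names (searches, clickthroughs), their columns, and the sample rows are all hypothetical; the slides do not describe the actual schema or the database engine used.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical log tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE searches (
            search_id   INTEGER PRIMARY KEY,
            query_text  TEXT NOT NULL,
            searched_at TEXT NOT NULL
        );
        CREATE TABLE clickthroughs (
            click_id  INTEGER PRIMARY KEY,
            search_id INTEGER NOT NULL REFERENCES searches(search_id),
            target    TEXT NOT NULL
        );
    """)
    # Made-up rows, for illustration only.
    conn.executemany("INSERT INTO searches VALUES (?, ?, ?)", [
        (1, "public health spending", "2014-05-01"),
        (2, "new england journal of medicine", "2014-05-02"),
    ])
    conn.executemany("INSERT INTO clickthroughs VALUES (?, ?, ?)", [
        (1, 1, "Primo article"),
        (2, 1, "Scopus"),
        (3, 2, "Primo article"),
    ])
    return conn


def clicks_per_target(conn):
    """Count clickthroughs per result target, most-clicked first."""
    rows = conn.execute("""
        SELECT c.target, COUNT(*) AS n
        FROM clickthroughs AS c
        JOIN searches AS s ON s.search_id = c.search_id
        GROUP BY c.target
        ORDER BY n DESC, c.target
    """)
    return rows.fetchall()


if __name__ == "__main__":
    print(clicks_per_target(build_demo_db()))
```

Keeping searches and clickthroughs in separate tables joined on a search ID is one plausible reading of "two tables"; it lets a single query relate what users typed to which result target they followed.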

Looking at Primo as target in Easy Search

• Between 05-01-2014 and 06-21-2014, users clicked on one of the Primo article, Primo book, or Primo All search result links 15,068 times.

• This is out of a total of 147,326 user result clickthroughs during that 05-01 to 06-21 time period.

Primo Clickthrough study (More)

• We looked at a sample of 473 of the 15,068 searches, categorizing each search as known-item or topical, then looking at the resulting search matches from a number of targets:

• Included all the Primo results and the ACSE, Scopus, Web of Knowledge, CrossRef, WorldCat Discovery, VuFind, IShare, Summon, Google, and Google Scholar results within Easy Search.

• The success of the searches in these result targets was coded and stored in a relational database.

Out of the sample of 473 results:

• There were 245 known-item searches in the sample (51.8%) and 228 topical searches (48.2%).

• Of the 245 known-item searches, 159 (64.9%) were judged successful in that they brought back the expected result on the first page or as the top item.

• Of the 228 topical searches, 194 were judged to have brought back some relevant results.
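
The proportions for this sample can be recomputed directly from the counts on the slide. The sketch below is only a convenience check; the constant and function names are made up for illustration, and the counts are the ones reported above.

```python
# Counts taken from the slide; names are illustrative only.
SAMPLE = 473
KNOWN_ITEM, KNOWN_ITEM_OK = 245, 159
TOPICAL, TOPICAL_OK = 228, 194


def pct(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


def summarize():
    """Recompute the share and success rate for each search type."""
    # The two search types should account for the whole sample.
    assert KNOWN_ITEM + TOPICAL == SAMPLE
    return {
        "known_item_share": pct(KNOWN_ITEM, SAMPLE),
        "topical_share": pct(TOPICAL, SAMPLE),
        "known_item_success": pct(KNOWN_ITEM_OK, KNOWN_ITEM),
        "topical_success": pct(TOPICAL_OK, TOPICAL),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}%")
```

The figures here are rounded to one decimal place; truncating instead of rounding would lower some of them by 0.1.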

Finding Known Items:

Looking at the 159 ‘successful’ Primo known-item searches, we found that all of them were also successfully retrieved in Scopus, EBSCO, CrossRef, WorldCat Discovery, Web of Knowledge, Google, or Google Discovery searches.
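
The overlap check described here amounts to a set comparison: take the searches Primo answered successfully and subtract everything any other target also retrieved. The sketch below is a hypothetical illustration; the search IDs and per-target results are made up, and the function name is not from the study.

```python
def uncovered_searches(primo_hits, other_targets):
    """Return Primo successes that no other target also retrieved."""
    covered = set()
    for hits in other_targets.values():
        covered |= set(hits)
    return set(primo_hits) - covered


# Hypothetical coded results: search IDs judged successful per target.
primo_hits = {101, 102, 103}
other_targets = {
    "Scopus": {101},
    "CrossRef": {102, 103},
    "Google Scholar": {103},
}

if __name__ == "__main__":
    missing = uncovered_searches(primo_hits, other_targets)
    print(f"Primo-only successes: {len(missing)}")
```

An empty result corresponds to the finding on this slide: every successful Primo known-item search was matched by at least one other target.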

Notes from analyses of Easy Search results:

• Some of the successful Primo searches were full citation searches of the form: Hemenway, D. (2010). Why We don't Spend Enough on Public Health. New England Journal of Medicine, 362(18), 1657

• We found that CrossRef and WorldCat Discovery were very good at retrieving successful search results for these types of searches.

• In general, we found that Scopus and EBSCO overlapped extensively with Primo and that CrossRef, WorldCat Discovery, and Google Scholar were very important and useful in matching the successful Primo results.

Overarching Issues

• Understanding and supporting user search behaviors is important
• The effectiveness of full-text search and relevancy ranking is under question
• Some modules (e.g. OPAC) may be better than central index search/discovery
• Important to distinguish between discovery features and known-item access mechanisms

Next Steps

Informed by evidence, the Discovery and Delivery Study Team will:

• Articulate principles for a discovery/delivery strategy.
• Identify discovery system requirements and value proposition for central index.
• Evaluate Easy Search sustainability.
• Recommend strategy to Library’s Content Access Policy and Technology Committee.
