“More than Meets the Eye” - Analyzing the Success of User Queries in Oria

Hugo Huurdeman, Mikaela Aamodt, Dan Michael Heggø
University of Oslo Library
VIRAK conference, 2017-06-13



1. Introduction

• More insights are needed into library search issues, what goes right and what goes wrong

• Opportunity: analyze data gathered at UiO during the last two years

Research Questions

1. Which insights can we gain from classifying user queries within Oria by their popularity, specificity and intended target resources?

2. To what extent are the most popular user queries successful?

3. What underlying reasons for unsuccessful queries can be determined?

2. Data: Primo Analytics

• Actions
• Device usage
• Facets
• Popular Searches (monthly)
  • Timeframes: Jan-Jun 2015; Nov-Dec 2015; Jan-Sep 2016
• Sessions
• Zero result queries (daily)
  • Aug 7 2015-Sep 30 2016

2. Processing & Analysis

• Basic normalization of queries
  • "stanislav andreski" → stanislav andreski
• Manual annotation of
  • (1) top 50 popular queries
  • (2) random selection of 50 "zero result" queries
• If derivable from query, determine
  • nature of query?
  • for which resource type?
  • curriculum-related? [in pensum lists UiO]
  • successful? [result on first results page]
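The basic normalization step could look something like the sketch below. The slides only show the quote-stripping example; lowercasing, Unicode normalization and whitespace collapsing are assumptions added here for illustration, and `normalize_query` is a hypothetical name.

```python
import re
import unicodedata


def normalize_query(raw: str) -> str:
    """Normalize a raw query string so that trivial variants count as one query."""
    # Assumption: fold compatibility characters and case (not stated in the slides).
    text = unicodedata.normalize("NFKC", raw).lower()
    # Drop straight and curly double quotes, as in the slide's example.
    text = re.sub(r'["“”]', "", text)
    # Assumption: collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_query('"stanislav andreski"'))
```

Apostrophes and guillemets are left alone on purpose, since they can be part of a name or a quoted title inside a longer query.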

3. Previous work

• Dominance of commercial search engines in students’ information seeking (Griffiths & Brophy, 2005)
• Students adopting search behavior from commercial search engines in library OPACs (Shiv, 2012; Willson & Given, 2010)
• In particular, undergraduates just entering academia (Novotny, 2014)

4. Nature of user queries

RQ1 Which insights can we gain from classifying user queries within Oria by their popularity, specificity and intended target resources?

• Popular Searches and Zero Result Searches

4.1 Popular Searches dataset

• 5,776 different queries, 115,590 searches (UiO)
  • monthly totals (500)
  • 2015 & 2016
  • 4.9% of all search actions
• Analyzing top 50
  • Together issued almost 20,000 times
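Combining the monthly Popular Searches lists into one overall ranking can be sketched as follows. The input format (pairs of normalized query and monthly count) and the function name `aggregate_popular` are assumptions for illustration; the actual Primo Analytics export may be structured differently.

```python
from collections import Counter


def aggregate_popular(monthly_rows, top_n=50):
    """Sum monthly counts per query; return the top-N list and its share of all searches."""
    totals = Counter()
    for query, count in monthly_rows:
        totals[query] += count
    top = totals.most_common(top_n)
    grand_total = sum(totals.values())
    top_share = sum(count for _, count in top) / grand_total if grand_total else 0.0
    return top, top_share


# Made-up rows, only to show the call shape.
rows = [("atekst", 100), ("pubmed", 80), ("atekst", 50), ("nature", 20)]
top, share = aggregate_popular(rows, top_n=2)
print(top, round(share, 2))
```

With the figures on the slide, the top 50 account for almost 20,000 of 115,590 searches, or roughly 17% of the Popular Searches dataset.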

Top 50

• What types of queries are most popular?
• titles (33x, 66%)
  • det kvalitative forskningsintervju
  • books, journals, databases
• topics / titles (6x, 12%)
  • spesialpedagogikk
  • neurology
• topics (4x, 8%)
  • cessio legis

Top 50

• What resource types are sought for?
• Books (50%)
  • det kvalitative forskningsintervju
  • det norske samfunn
• Journals (12%)
  • science
  • lancet
• Databases (10%)
  • atekst
  • pubmed
  • duo

Curriculum?

• Were the queries related to pensum books?
• Quite often: at least 58%!
  • menneskets fysiologi
  • det kvalitative forskningsintervju
• Or other things (9%) :)
  • aftenposten; morgenbladet
  • harry potter

Curriculum?

• Were the queries related to pensum materials?
• Quite often: at least 58%!
  • menneskets fysiologi
  • det kvalitative forskningsintervju
• Or other things (34%) :)
  • aftenposten; morgenbladet

4.2 “Zero result” queries

• 39,925 different queries
• In total 52,257 searches
  • Aug 2015-Sep 2016
  • 2.2% of all search actions
• Annotating random sample (50)

• al azm sadik «orientalisme og omvendt orientalisme» 28
• curr eye res 27
• allmenningen olaf 24
• 12136173x 22
• am j ophthalmol 21
• 821017268 21
• askim j. 19
• 900317809 19
• cheng 19
• (direkte krav or direktekrav) and subrogasjon 17
• 151582998 16
• 142734500 16
• direkte krav or direktekrav 16
• 961675616 16
• 9780549547303 16
• agirdag phalet 16
• andresen steinar elin l. boasson og geir hønneland (2012) international environmental agreements 15

Queries without results

What is the nature of the “zero result” queries performed in Oria?

• Pasted reference (20x, 40%)
  • Browning, N. (2015). The ethics of two-way symmetry and the dilemmas of dialogic kantianism. Journal of Media Ethics
• Title (15x, 30%)
  • Sentralbankens oppgaver i dag og i fremtiden
• Author (8x, 16%)
  • Christopher Hotchens

Queries without results

Which resource types are not found?

• Book (28%)
  • Prcopius Secret History
• Book Chapter (12%)
  • Solhaug, (2006). Kapittel 13: Strategisk læring i samfunnsfag. I
• Article (24%)
  • E.g., pasted references

Degree of pensum queries?

At least 28% of the unsuccessful queries are for pensum materials

• Fukuyama, F. (2013): What Is Governance? Governance, Vol. 26, No. 3, July 2013 (s. 347–368).

• basic immubology

5. Search Success

RQ2 To what extent are the most popular user queries successful?

Top queries

1. atekst 1425
2. pubmed 1221
3. exphil 719
4. det kvalitative forskningsintervju 711
5. nature 669
6. medical genetics 598
7. direktekrav 562
8. menneskets fysiologi 540
9. 998903677444702000 500
10. science 483
11. spesialpedagogikk 475
12. lancet 438
13. jussens venner 369
14. 991645464 364


Were the queries successful?

• Often, yes:
  • main result in first 10 results: 58%
• not (easily) found: 20%
  • For example:
    • pubmed
    • nature
    • science

What are the causes of unsuccessful queries?

• Ambiguous names
  • nature, science
• At the time of writing, no entries for some databases
  • pubmed

6. Queries without results

RQ3 What underlying reasons for zero result queries can be determined?


Why no results?

• Query being too specific (pasted reference, pasting quote) (22%)
  • Browning, N. (2015). The ethics of two-way symmetry and the dilemmas of dialogic kantianism. Journal of Media Ethics
• Misspellings, reference mistakes (e.g. wrong year) (20%)
  • svennevig j.: ledelsesretorikk i nedbemanningssituasjoner 2009
• Using incorrect query syntax (2%)
  • "McLuhan" AND/OR "Understanding media"
  • 978-147996410-9
• Wrong scope (12%), wrong field (4%)

Why no results (2)

• Searching for an ISBN number (specific edition not in library)
  • 9780618721566
• Searching for journal titles, ISSNs, DOIs
  • journal of speech and hearing disorders
• Searching for course codes
  • MED1100
• Resource not (indexed) in Oria (16%)
  • Haugianerliberalistene: En analyse av haugianere som politikere og næringslivsaktører
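The annotation in this study was done manually. As a rough illustration only, some of the categories above (ISBNs, DOIs, ISSNs, course codes, pasted references) could be pre-labelled automatically with a heuristic like the one below. All patterns, labels and function names are assumptions made for this sketch, not the authors' method; the ISSN and course-code checks test the shape only.

```python
import re

# Assumed patterns; course codes are guessed from examples such as MED1100 and INF2260.
COURSE_CODE = re.compile(r"^[A-Za-z]{2,6}\d{4}$")
DOI = re.compile(r"^10\.\d{4,9}/\S+$")
ISSN = re.compile(r"^\d{4}-?\d{3}[\dXx]$")  # shape only, no check digit
YEAR_IN_PARENS = re.compile(r"\(\d{4}\)")


def looks_like_isbn(query: str) -> bool:
    """Return True if the query is an ISBN-13 or ISBN-10 with a valid check digit."""
    digits = re.sub(r"[\s-]", "", query)
    if re.fullmatch(r"\d{13}", digits):
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
        return total % 10 == 0
    if re.fullmatch(r"\d{9}[\dXx]", digits):
        total = sum((10 - i) * (10 if d in "Xx" else int(d)) for i, d in enumerate(digits))
        return total % 11 == 0
    return False


def classify_query(query: str) -> str:
    """Assign a coarse, heuristic label to a zero result query."""
    q = query.strip()
    if looks_like_isbn(q):
        return "isbn"
    if DOI.match(q):
        return "doi"
    if ISSN.match(q):
        return "issn"
    if COURSE_CODE.match(q):
        return "course_code"
    # Assumption: a year in parentheses plus several words suggests a pasted reference.
    if YEAR_IN_PARENS.search(q) and len(q.split()) >= 6:
        return "pasted_reference"
    return "other"


for example in ["9780618721566", "MED1100", "journal of speech and hearing disorders"]:
    print(example, "->", classify_query(example))
```

A heuristic like this cannot separate misspelled titles from items that are simply not indexed, so it would at most narrow down what still needs manual annotation.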

Suggestions for misspellings?

• 10 out of 50 queries (20%) were caused by misspellings.

• For half of these, a correct spelling suggestion exists (April ’17)

Do queries still return 0 results?

• In 72% of the cases there is no improvement, but 28% are now resolved (April ’17)

7. Conclusion & discussion

• Library catalog containing more than meets the eye
  • Even though materials are available, they do not always show in searches
• Issues in:
  • query formulation
  • system support...

Discussion

• Refine and extend search suggestions
• Enhanced spell check / query corrections
  • why students underacheive → 0 results
  • why students underachieve → 463 results
• Query suggestions and autocomplete
  • Could be based on previous (successful) queries
  • especially recurring queries should be supported (50% of popular queries!)
• ISBN suggestions
  • Automatically search for the book title and author (using information derived from ISBN number?)
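One way to base query corrections on previous successful queries is fuzzy string matching. This minimal sketch uses `difflib.get_close_matches` from the Python standard library. The list of known queries, the 0.85 cutoff and the name `suggest_query` are assumptions for illustration and say nothing about how Oria or Primo produce suggestions.

```python
import difflib


def suggest_query(query, known_queries, cutoff=0.85):
    """Return the closest previously successful query, or None if nothing is close enough."""
    candidate = query.lower()
    matches = difflib.get_close_matches(candidate, known_queries, n=1, cutoff=cutoff)
    if matches and matches[0] != candidate:
        return matches[0]
    return None


# Hypothetical list of queries that returned results earlier.
known = ["why students underachieve", "basic immunology", "det kvalitative forskningsintervju"]
print(suggest_query("why students underacheive", known))
print(suggest_query("basic immubology", known))
```

Matching whole queries against a log of successful searches would mainly help with recurring queries. A misspelled word inside a pasted reference needs word-level correction instead.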

Discussion

• 0 results: get more helpful information
  • Requesting assistance, material
• Contextual search suggestions
  • e.g. broaden search, simplify query, alternate formulations
• Available scopes
  • suggest how many results a search would return in another scope

Discussion

• Importance of "pensum" queries
  • Better integration with pensum lists
• Detecting curriculum queries
  • E.g. widget which suggests pensum materials directly // referring students to UiO fagsider (course pages) // etc.
• On the cataloging side:
  • Course codes (e.g. INF2260): "to catalog or not to catalog"
• More support for database queries
  • Atekst could be found, not pubmed

Discussion

• Monitoring searches
  • Detect sudden ‘spikes’ in zero result queries
  • Finding errors in curriculum lists
  • Detecting ‘holes’ in collections (acquisition staff), and e.g. popular books with too few copies
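Detecting sudden spikes in the daily zero result counts could be sketched with a simple trailing-window rule. The window length, the threshold and the name `find_spikes` are assumptions for illustration; a production monitor would probably also need to handle weekly and semester-start patterns.

```python
from statistics import mean, pstdev


def find_spikes(daily_counts, window=7, threshold=3.0):
    """Return indexes of days whose count exceeds the trailing mean by threshold * spread."""
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        # Use a minimum spread of 1 so a perfectly flat history does not flag tiny changes.
        sigma = max(pstdev(history), 1.0)
        if daily_counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes


# Made-up daily counts for one zero result query; day 7 stands out.
counts = [10, 12, 11, 9, 10, 11, 10, 60, 12, 11]
print(find_spikes(counts))
```

Run per query rather than on the overall total, a rule like this could point to a single failing title, for example a misspelled entry in a curriculum list.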

8. Future work

• Plans for further analysis
  • Comparing queries with most frequent loans
    • Alma Analytics
  • Current analysis: many known-item queries. Also look at common exploratory queries
    • medicine; math; united nations; economics
    • how can we support those types of queries better?
• Obtaining more data (limits of Primo Analytics)
  • e.g. analyzing "struggling sessions", stats location, etc.
  • Look at use of external databases, link resolver stats

bit.ly/the-visualisation-project

@BookNavigation
Find this presentation here
