Contrary to popular belief, elephants do not provide good relevancy tests. Nor do cats.
Contrary to popular belief, elephants do not provide good relevancy tests.
Nor do cats.
http://daisythecurlycat.blogspot.com/2009/03/elephant-mancat.html
Naomi Dushay – Stanford University Libraries 2/2011
Practical Relevancy Testing
Naomi Dushay
Stanford University Libraries
Code4Lib 2011
http://bacolicio.us/http://www.jesterartsillustrations.com/images/free/300/17066.jpg
http://bacolicio.us/http://www.jesterartsillustrations.com/images/free/300/92067.jpg
Not !
Solr has …
Shiny Knobs!
http://www.aandhbrass.co.uk/products/door_furniture/mortice_knobs.htm
NO changes without tests!
http://www.outofbodies.com/sketchbook/mamacusumano.jpg
How Do You Test
Search Result Relevancy?
What IS
Search Result Relevancy?
PRECISION
RECALL
CONTEXT
http://www.istockphoto.com/file_thumbview_approve/1457289-cherry-slot-machine.jpg
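For reference, precision and recall have standard definitions in information retrieval: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that were retrieved. A minimal Ruby sketch follows; the function name and record ids are illustrative assumptions, not taken from the talk, and it presumes you already have a vetted list of relevant records for the query.

```ruby
require "set"

# Precision: fraction of retrieved records that are relevant.
# Recall: fraction of relevant records that were retrieved.
# Returns [precision, recall]; an empty denominator yields 0.0.
def precision_and_recall(retrieved_ids, relevant_ids)
  retrieved = retrieved_ids.to_set
  relevant = relevant_ids.to_set
  hits = (retrieved & relevant).size
  precision = retrieved.empty? ? 0.0 : hits.to_f / retrieved.size
  recall = relevant.empty? ? 0.0 : hits.to_f / relevant.size
  [precision, recall]
end

# Hypothetical record ids, for illustration only.
precision, recall = precision_and_recall(%w[a b c d], %w[a b e])
puts format("precision=%.2f recall=%.2f", precision, recall)
```

Neither number captures context, which is why both need a human-vetted list of relevant records to compute at all.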
How Do You Evaluate the Relevancy of Search Results?
A: Use Complaints.
Bess Sadler
A: Use Complaints.
… er …
Feedback.
“… A search for (no quotes):
memoirs of a physician dumas
… book with that title by dumas is the third result; I would expect it to be the first.”
HOW Do You TEST
Search Result Relevancy?
Repeatable
Automatable
http://discovery-grindstone.blogspot.com/
Full Stack Test:
As if:
• Query entered in UI
• App gets form input
• App processes query
• App sends processed query to Solr
• Solr processes query
• App processes Solr results
• Results returned to UI
Ruby on Rails
http://cukes.info/
Automatable!
“… A search for (no quotes):
memoirs of a physician dumas
… book with that title by dumas is the third result; I would expect it to be the first.”
(demo)
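The demo itself is not captured in this transcript. As a rough stand-in, here is a minimal Ruby sketch of how the complaint above could become a repeatable check. Everything in it is an assumption for illustration: `FAKE_INDEX`, `search`, `rank_of`, `first_result?` and the record ids are hypothetical, and a real full-stack test would submit the query through the application rather than read a canned ranking.

```ruby
# Hypothetical stand-in for a full-stack search. A real test would submit the
# query through the application UI and read record ids off the results page.
FAKE_INDEX = {
  "memoirs of a physician dumas" => %w[rec_other_1 rec_other_2 rec_memoirs]
}.freeze

# Returns the ranked record ids for a query (empty when nothing matches).
def search(query)
  FAKE_INDEX.fetch(query, [])
end

# 1-based position of a record in the result list, or nil if absent.
def rank_of(results, record_id)
  index = results.index(record_id)
  index && index + 1
end

# The expectation from the feedback: the record should be the first result.
def first_result?(results, record_id)
  rank_of(results, record_id) == 1
end

results = search("memoirs of a physician dumas")
puts "expected record is at rank #{rank_of(results, 'rec_memoirs')}"
puts first_result?(results, "rec_memoirs") ? "PASS" : "FAIL: expected rank 1"
```

With the canned ranking above the check fails, which mirrors the reported problem; once a relevancy change moves the record to the top, the same check passes and keeps guarding against regressions.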
Use (a copy of?) your full index for testing.
Tests assert that searches, in context, retrieve correct results.
VUF-779 Student
“… [searching for] Le Rayon Vert, … Searchworks stupidly supplies results for textiles, when the French Rayon in question refers to sunshine.”
New Test!
http://cukes.info/
rspec
rspec-rails
webrat
cucumber:
Regular Expressions
webrat:
Faking User Input to Web Pages
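To make the "Regular Expressions" point concrete: Cucumber matches each plain-language step against a pattern, and the capture groups become the arguments passed to the step's code. The sketch below shows only that matching idea in plain Ruby, without Cucumber itself; the constant `BEFORE_STEP` and the helper `parse_before_step` are hypothetical names, and the step wording is borrowed from an example used elsewhere in this talk.

```ruby
# A step pattern in the style of a Cucumber step definition: the capture
# groups are what would be handed to the step's code as arguments.
BEFORE_STEP = /\Arecord (\d+) should be before record (\d+)\z/

# Returns the captured record ids as strings, or nil when the step text
# does not match the pattern.
def parse_before_step(step_text)
  match = BEFORE_STEP.match(step_text)
  match && match.captures
end

p parse_before_step("record 777 should be before record 999")
```

Keeping the pattern narrow (digits only, anchored at both ends) means a mistyped step fails to match instead of silently running with the wrong arguments.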
Step Definitions
Examples of HTML simulations via webrat:
• pulldown: And I select "_____" from "______"
• Link: And I follow "____________"
rspec:
Behavior Driven Development - a twist on unit testing:
object.should (be_something | do_something)
vs.
assertXXX(object.method)
Step Definitions
Localized:
“the first 4 results should be …”
“record 777 should be before record 999”
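The logic behind localized steps like these can be sketched in a few lines of Ruby. This is an illustration under assumptions, not Stanford's actual step definitions: the helper names are hypothetical, and "the first 4 results should be …" is read here as "the first N results are exactly these records, in any order".

```ruby
# True when the first expected_ids.size results are exactly the expected
# records, in any order (one possible reading of "the first 4 results
# should be ...").
def first_results_are?(result_ids, expected_ids)
  result_ids.first(expected_ids.size).sort == expected_ids.sort
end

# True when both records are present and earlier_id ranks above later_id.
def ranked_before?(result_ids, earlier_id, later_id)
  earlier = result_ids.index(earlier_id)
  later = result_ids.index(later_id)
  !earlier.nil? && !later.nil? && earlier < later
end

# Hypothetical ranked record ids, for illustration only.
results = %w[777 123 456 999 111]
puts first_results_are?(results, %w[123 777 456 999])
puts ranked_before?(results, "777", "999")
```

Comparing relative order rather than absolute positions lets the second check survive other records moving around the two being compared.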
Step Definitions
Localized:
Not too brittle!!
Accommodate ongoing changes to your data
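One way to keep such assertions from being brittle is to assert a tolerance rather than an exact value. The Ruby sketch below is an assumption about what that could look like, with hypothetical helper names: a record only has to stay within the top N, and a hit count only has to meet a floor.

```ruby
# Passes while the record stays anywhere in the top n, so a newly cataloged
# record that reshuffles the first page does not fail the test.
def within_first?(result_ids, record_id, n)
  result_ids.first(n).include?(record_id)
end

# Passes as the collection grows: asserts a floor, not an exact hit count.
def at_least?(total_hits, minimum)
  total_hits >= minimum
end

# Hypothetical values, for illustration only.
puts within_first?(%w[a b c d], "c", 3)
puts at_least?(10, 7)
```

The trade-off is that a looser bound can hide a small relevancy regression, so the tolerance is worth choosing per test.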
Step Definitions
Stanford’s step definitions:
http://discovery-grindstone.blogspot.com/
http://www.stanford.edu/~ndushay/code4lib2011/search_result_steps.rb
But I USE (toasters) ...
http://cukes.info/
PHP: https://github.com/aslakhellesoy/cucumber/wiki/PHP
Google: cucumber toasters
Weaknesses With This Approach
1. you don’t know what you don’t know.
- unreported problems
- poor recall
- good precision for some items only
2. some feedback is too vague.
3. context experts are too …
– “Socrates shows 8 results, Searchworks 7, and there are some results in one but not the other and vice versa”
– “Great! Tell me what should be in the result set.”
– (crickets)
4. positive feedback for specific searches and results is rare.
4a. positive feedback gets little attention.
5. cucumber tests can be slow
Shiny Knobs!
http://www.aandhbrass.co.uk/products/door_furniture/mortice_knobs.htm
http://www.jesterartsillustrations.com/images/free/300/19089.jpg
http://discovery-grindstone.blogspot.com/
Zap ‘em back with superlove!
“Back in June, you reported that a search …did not get the expected results.
[fix …]
Thank you for reporting your specific example, especially the expected ckey -- we use the information to write test case(s) that, once fixed, must pass forever more.”
The Big Guns
Whenever you test something ("manually") in SearchWorks, we would like to capture your expectations as a cucumber scenario so we can run the test repeatedly and automate it. Benefits:
• We won't have to keep asking you to check the same things over and over. Imagine never having to perform a given test search again!
• We can ensure that applying a fix for one problem won't inadvertently break something we've already fixed.
• We can automate running a large suite of tests nightly so we keep checking that we haven't broken anything.
• As we add specific searches and expected results against our own (meta)data corpus, we are accruing relevancy tests for our own data, based on human review of search results.
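The benefits above amount to a growing regression suite. A minimal Ruby sketch of that idea follows, independent of Cucumber; `RelevancyCase`, `CASES`, `failing_cases`, the queries and the record ids are all hypothetical, and the canned lambda stands in for the real search stack.

```ruby
# Each case records one reported search and the record expected near the top.
# Queries and ids below are hypothetical placeholders.
RelevancyCase = Struct.new(:query, :record_id, :max_rank)

CASES = [
  RelevancyCase.new("memoirs of a physician dumas", "rec_memoirs", 1),
  RelevancyCase.new("le rayon vert", "rec_rayon", 3)
].freeze

# Runs every case against searcher (anything that responds to call and
# returns ranked record ids) and returns the cases that failed.
def failing_cases(cases, searcher)
  cases.reject do |kase|
    searcher.call(kase.query).first(kase.max_rank).include?(kase.record_id)
  end
end

# Stand-in for the real search stack, returning canned rankings.
canned = {
  "memoirs of a physician dumas" => %w[rec_memoirs rec_a],
  "le rayon vert" => %w[rec_textiles_1 rec_textiles_2 rec_textiles_3 rec_rayon]
}
searcher = ->(query) { canned.fetch(query, []) }

failing_cases(CASES, searcher).each do |kase|
  puts "FAIL: #{kase.query.inspect} expected #{kase.record_id} in top #{kase.max_rank}"
end
```

Run nightly, a list like this flags the moment a configuration change for one query pushes a previously fixed query back out of position.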
http://hubpages.com/hub/BodyBuilding-The-Science-behind-Creatine-A-Definitive-Guide
KTHXBAI
http://cukes.info/
http://rspec.info/
http://www.pragprog.com/titles/achbd/the-rspec-book
http://discovery-grindstone.blogspot.com/
http://www.stanford.edu/~ndushay/code4lib2011/
Evaluating search result relevancy is difficult for any sizable amount of data, since human-vetted ideal search results are essentially non-existent. This is true even for library collections, despite dedicated librarians and their familiarity with the collections. So how can we evaluate whether search engine configuration changes (e.g. boosting, field analysis, search analysis settings) are an improvement? How can we ensure the results for query A don’t degrade while we try to improve results for query B? Why yes, Virginia, automatable tests are the answer. This talk will show you how you can easily write these tests from your hidden goldmine of human-vetted relevancy rankings.