Design for Interaction
-
Upload
daniel-tunkelang -
Category
Technology
-
view
2.744 -
download
4
description
Transcript of Design for Interaction
© 2009 Endeca Technologies, Inc. All rights reserved.
design for interaction
Daniel TunkelangChief Scientist, Endeca
© 2009 Endeca Technologies, Inc. All rights reserved.2
about me
Organizing SIGIR ’09 Industry Track in Boston on July 22nd!
© 2009 Endeca Technologies, Inc. All rights reserved.3
about endeca
250M+ end users
per month
250M+ end users
per month600+ customers
$100M+ annual sales
leading provider ofsearch applications
© 2009 Endeca Technologies, Inc. All rights reserved.4
what i hope you learn from this talk
the db and ir perspectives have a common thread
convergence may be upon us
but we need interaction to make it work
© 2009 Endeca Technologies, Inc. All rights reserved.5
overview
don't put all your eggs in one basket
design for interaction
human-computer information retrieval
© 2009 Endeca Technologies, Inc. All rights reserved.6
don’t put all your eggs in one basket
Still Life with Basket and Broken Eggs by Michael Edwards, 2008
© 2009 Endeca Technologies, Inc. All rights reserved.7
the db approach: perfection in, perfection out
http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/
© 2009 Endeca Technologies, Inc. All rights reserved.8
db usability researchers recognize the pain
© 2009 Endeca Technologies, Inc. All rights reserved.9
sql is hard
Making Database Systems Usable[Jagadish et al., SIGMOD 2007]
• labor-intensive query construction
• lengthy query evaluation
• high query reformulation cost
__sql
© 2009 Endeca Technologies, Inc. All rights reserved.10
data sucks and users are lazy
Extracting Problems for Databaseand IR Researchers[Naughton, Spring 2008 North East DB/IR Day]
• real data is– incomplete– inconsistent– incorrect
• users don’t want to learn– data schemas– structured query languages we’re not gonna take it!
© 2009 Endeca Technologies, Inc. All rights reserved.11
the ir way: don’t worry, be happy
http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries
© 2009 Endeca Technologies, Inc. All rights reserved.12
ir for db people: what would google do?
information Need query select from results
rank using IR model
USER:
SYSTEM:tf-idf PageRank
© 2009 Endeca Technologies, Inc. All rights reserved.13
assumptions of relevance-centric ir approach
• self-awareness
• self-expression
• model knows best
• answer is a document
• one-shot query
© 2009 Endeca Technologies, Inc. All rights reserved.14
life is not a batch
• db approach expects too much of user• ir approach expects too much of system
both approaches act as if it allcomes down to a single query
is that your final answer question?
© 2009 Endeca Technologies, Inc. All rights reserved.15
design for interaction
The Future of Social Interaction by Jim Stoten
© 2009 Endeca Technologies, Inc. All rights reserved.16
changes assumptions about what to optimize
recall
pre
cis
ion
complexity relevance
communication
© 2009 Endeca Technologies, Inc. All rights reserved.17
how do we optimize communication?
transparency
control
guidance
© 2009 Endeca Technologies, Inc. All rights reserved.18
ir offers a black box
ca c'est la caisse. le mouton que tu veux est dedans.
© 2009 Endeca Technologies, Inc. All rights reserved.19
db / set retrieval offers 2 out of 3
transparency
control
guidance
© 2009 Endeca Technologies, Inc. All rights reserved.20
but we need it all!
• set retrieval is a failure in the ir world– though quite successful in the db world!
• but ranked retrieval is inherently crippled– no transparency, control, or guidance!
how do we optimize for communication?
© 2009 Endeca Technologies, Inc. All rights reserved.21
human-computer information retrieval
• don’t just guess the user’s intent• increase user responsibility and control• require and reward human intellectual effort
“Toward Human-Computer Information Retrieval”
Gary Marchionini
© 2009 Endeca Technologies, Inc. All rights reserved.22
great idea
how?
© 2009 Endeca Technologies, Inc. All rights reserved.23
treat query construction as a process
A Case for Interaction[Koenemann and Belkin, 1996]
• used term feedback to improve alerting queries
• users select from suggested terms
• 17 – 34% improvement in precision @ 30
• users liked the feedback interface
© 2009 Endeca Technologies, Inc. All rights reserved.24
expose the facets of semistructured content
© 2009 Endeca Technologies, Inc. All rights reserved.25
success in the lab and the field
• favored in user studies by Marti Hearst– http://flamenco.berkeley.edu/
• ubiquitous in ecommerce– amazon.com– eBay– endeca powers 42 of top 100 online retailers
• taking over media, libraries, enterprise, etc.
© 2009 Endeca Technologies, Inc. All rights reserved.26
even a few db folks have drunk the kool-aid
DataGuides[Goldman and Widom, VLDB 1997]
• user-friendly schema summaries
Magnet[Sinha and Karger, SIGMOD 2005]
• navigation and refinement options
common theme: semistructured
© 2009 Endeca Technologies, Inc. All rights reserved.27
what is semistructured data?
• one universe
• self-describing
• blends data / meta-data
© 2009 Endeca Technologies, Inc. All rights reserved.28
data modeling flexibility
• no a-priori schema– integrated sources without up-front schema design
• richer modeling capabilities tame data complexity– hierarchy, multi-valued fields, sparse fields
• schema flexibility eases schema evolution– new entity types, new data source
Databases Content ManagementERP
Groupware and Collaboration
WWW Internet
SOA, ESB,Web ServiceFile Systems
© 2009 Endeca Technologies, Inc. All rights reserved.29
semantically direct queries
<shirt><sku>1234</sku><sleeve>Long</sleeve><desc>Classic end-on-end shirt</desc><price>39.99</price><salePrice>29.99</salePrice><color>Blue</color><color>Yellow</color><color>White</color>...
</shirt> <trousers><sku>1579</sku><price>59.99</price><color>Khaki</color>...
</trousers>
which on-sale itemsare available in blue?
<buyingGuide><title>Selecting the right ski coat for you.</title><file>skiguide.pdf</file><keyword>ski</keyword><keyword>coat</keyword>...
</buyingGuide>
which attributescharacterize on-saleblue items?
price, sleeve,color, salePrice,brand, fabric, …
© 2009 Endeca Technologies, Inc. All rights reserved.30
but let’s make this concrete
Uh oh, I’m presenting atSIGMOD! Better find a good
book about databases!
© 2009 Endeca Technologies, Inc. All rights reserved.31
quick, to the goog-mobile!
not quite…
© 2009 Endeca Technologies, Inc. All rights reserved.32
i know, i’ll go to the library!
#%@$!
© 2009 Endeca Technologies, Inc. All rights reserved.33
let’s try a little hcir…
© 2009 Endeca Technologies, Inc. All rights reserved.34
hcir works for news too
© 2009 Endeca Technologies, Inc. All rights reserved.35
life in a semistructured world
• search is a great starting point– users can’t / won’t initiate structured queries
• ranked lists are an inadequate ending point– search queries are lossy projections of intent
• hcir leads users down a garden path to structure
© 2009 Endeca Technologies, Inc. All rights reserved.36
lots of trade-offs
“everything should be made as simple as possible, but no simpler”
“speed of thought” vs. “going nowhere quickly”
“to err is human, but to really foul things up requires a computer”
simple interfaces don’talways yield satisfaction
© 2009 Endeca Technologies, Inc. All rights reserved.37
users want the triumvirate
• transparency• control• guidance
transparency and control are easy
guidance requires cleverness
© 2009 Endeca Technologies, Inc. All rights reserved.38
in closing
all of us want to help people access information
the best help is to help them help themselves
design for interaction thoughtransparency, control, guidance
© 2009 Endeca Technologies, Inc. All rights reserved.39
thank you…and come to SIGIR!
communication 1.0email: [email protected]
communication 2.0blog: http://thenoisychannel.com
twitter: http://twitter.com/dtunkelang
SIGIR: July 19-23 in Boston Industry Track on July 22nd!