Individualized Knowledge Access
David Karger, Lynn Andrea Stein, Mark Ackerman, Ralph Swick
Information Access
  A key task in Oxygen: help people manage and retrieve information
  Three overlapping projects:
    Haystack: information storage and retrieval; application clients
    Semantic Web: next-generation metadata
    Volt: collaborative access
Presentation Overview
  Motivation
    Information access behavior and goals
  System Design & Architecture
    Data Model
    Interacting data and UI components
  Working applications
    Base haystack
    Frontpage
    Volt
Motivation
Problem Scenario
  I try solving problems using my data:
    Information gathered personally
    High quality, easy for me to understand
    Not limited to publicly available content
  My organization:
    Personal annotations and meta-data
    Choose own subject arrangement
    Optimize for my kind of searching
    Adapts to my needs
Then Turn to a Friend
  Leverage
    They organize information for their own use
    Let them find things for me too
  Shared vocabulary
    They know me and what I want
  Personal expertise
    They know things not in any library
  Trust
    Their recommendations are good
Last to Library/web
  Answer usually there
    But hard to find
    Wish: rearrange to suit my needs
    Wish: help from my friends in looking
Lessons
  Individualized access
    Best tools adapt to individual ways of organizing and seeking data
  Individualized knowledge
    People know more than they publish
    That knowledge is useful to them and others
  Collaborative use
    Right incentives lead to sharing and joint use
Haystack
  Individualized access
    My data collection, organization
    Search tools tuned for me
  Collaborate to leverage individual knowledge
    Access unpublished information in others’ haystacks
    Self interest leads to public benefit
  Lens to personalize access to the world library
    Rearrange presentation to suit my personal needs
Example
  Info on probabilistic models in data mining
    My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola
    Tommi told his haystack that “Bayesian” refers to “probability models”
    Tommi has read several papers on Bayesian methods in data mining
    Some are by Daphne Koller
    I read/liked other work by Koller
    My Haystack queries “Daphne Koller Bayes” on Yahoo
    Tommi’s haystack can rank the results for me…
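The first two steps of this example (noticing who writes to me about a term, then borrowing that person's vocabulary) might be sketched as follows. This is only an illustration: the function names, the mail tuples, and the `refers_to` mapping are hypothetical and are not taken from Haystack itself.

```python
from collections import Counter

def likely_expert(term, mail):
    """Return the sender whose messages mention `term` most often, or None."""
    counts = Counter()
    for sender, body in mail:
        counts[sender] += body.lower().count(term.lower())
    sender, hits = max(counts.items(), key=lambda kv: kv[1], default=(None, 0))
    return sender if hits else None

def expand_terms(term, refers_to):
    """Add aliases that a friend's haystack says refer to the term."""
    related = {alias for alias, meaning in refers_to.items()
               if term.lower() in meaning.lower()}
    return [term] + sorted(related - {term})

# Hypothetical data standing in for my mail archive and a friend's notes.
mail = [
    ("Tommi", "Draft on probability models; the probability bounds look right"),
    ("Registrar", "Grades are due Friday"),
]
contact = likely_expert("probability", mail)
terms = expand_terms("probability", {"Bayesian": "probability models"})
```

With the expanded terms in hand, the later steps (querying an outside search engine and having the friend's haystack rank the results) would proceed from `contact` and `terms`.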
System Design
Gathering Data
  Haystack archives anything
    Web pages browsed, email sent and received, address book, documents written
  And any properties, relationships
    Text of object (for text search)
    Author, title, color, citations, quotations, annotations, quality, last usage
  Users freely add types, relationships
Semantic Web
  Arbitrary objects, connected by named links
    No fixed schema
    User extensible
  Sharable by any application
    A new “file system”?
  [Figure: graph of a document and its named links. Node labels: Doc, D. Karger, Haystack, Outstanding, HTML. Link labels: title, author, quality, says, type.]
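A data model of arbitrary objects connected by named links can be pictured as a set of (subject, link, object) triples. The sketch below is a minimal in-memory version, not Haystack's actual store. The class name, the identifier `doc1`, and the `reviewedBy` link are hypothetical, and the pairing of links to nodes is one assumed reading of the figure labels.

```python
from collections import defaultdict

class TripleStore:
    """Tiny in-memory store of (subject, link, object) triples."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)

    def add(self, subject, link, obj):
        """Record one named link; any link name is allowed (no fixed schema)."""
        triple = (subject, link, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)
        return triple

    def match(self, subject=None, link=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        if subject is not None:
            pool = self._by_subject.get(subject, set())
        else:
            pool = self._triples
        return sorted(t for t in pool
                      if link in (None, t[1]) and obj in (None, t[2]))

store = TripleStore()
store.add("doc1", "title", "Haystack")        # assumed pairing of figure labels
store.add("doc1", "author", "D. Karger")
store.add("doc1", "type", "HTML")
store.add("doc1", "quality", "Outstanding")
store.add("doc1", "reviewedBy", "lynn")       # user-defined link, added freely

authored = store.match(link="author", obj="D. Karger")
```

Because the store has no schema, adding a new kind of relationship is just another call to `add`, which is the "user extensible" property the slide describes.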
Gathering Data
  Active user input
    Interfaces let user add data, note relationships
  Mining data from prior data
    Plug-in services opportunistically extract data
  Passive observation of user
    Plug-ins to other interfaces record user actions
  Other Users
  [Figure: architecture diagram with components Data Extraction Services, Web Observer Proxy, Mail Observer Proxy, Triple Store, Machine Learning Services, Web Viewer, Volt Viewer/Editor, and Spider.]
Sample Applications
  Because everything uses the Semantic Web constructions, a variety of application clients can share information
    Web Browser: data viewer
    FrontPage: personalized information filter
    Volt: collaboration tool
Haystack via Web
  Web server interface
  Basic operations:
    Insert objects
    View objects
    Queries
Haystack via Web
  Viewer shows one node and associated arrows
  Service notices we’ve archived a directory, so archives the objects it contains (and so on…)
Haystack via Web
  Services detect document type, extract relevant metadata
  Output can specialize by type of object
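The way services react to newly archived objects might be sketched as callbacks that fire whenever a triple is added. Everything here is an assumption made for illustration: the `Archive` class, the `location`, `type`, and `contains` link names, the trailing-slash convention for directories, and the in-memory `LISTING` that stands in for a real file system.

```python
import os.path

# Hypothetical directory listing used in place of a real file system.
LISTING = {
    "docs/": ["docs/paper.html", "docs/notes/"],
    "docs/notes/": ["docs/notes/todo.txt"],
}
EXTENSION_TYPES = {".html": "HTML", ".txt": "text"}

class Archive:
    """Stores triples and notifies every registered service of each new one."""

    def __init__(self):
        self.triples = []
        self.services = []

    def add(self, subject, link, obj):
        triple = (subject, link, obj)
        if triple in self.triples:      # already known: do not re-trigger services
            return
        self.triples.append(triple)
        for service in self.services:
            service(self, triple)

def detect_type(archive, triple):
    """When an object's location is recorded, note what kind of object it is."""
    subject, link, obj = triple
    if link != "location":
        return
    if obj.endswith("/"):
        kind = "directory"
    else:
        kind = EXTENSION_TYPES.get(os.path.splitext(obj)[1], "unknown")
    archive.add(subject, "type", kind)

def archive_directory_contents(archive, triple):
    """When a directory is archived, archive the objects it contains too."""
    subject, link, obj = triple
    if link == "type" and obj == "directory":
        for child in LISTING.get(subject, []):
            archive.add(subject, "contains", child)
            archive.add(child, "location", child)   # triggers the services again

archive = Archive()
archive.services = [detect_type, archive_directory_contents]
archive.add("docs/", "location", "docs/")
```

Archiving the single top-level directory cascades through both services until every nested object has a type, which mirrors the "and so on…" in the slide. A viewer could then choose its output by looking up each object's `type` link.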
Mediation
  Haystack can be a lens for viewing data from the rest of the world
    Stored content shows what user knows/likes
    Selectively spider “good” sites
    Filter results coming back
      Compare to objects user has liked in the past
      Can learn over time
  Example: personalized news service
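Filtering incoming results by comparing them to objects the user has liked could be approximated with a bag-of-words profile and cosine similarity. This is a swapped-in, deliberately simple technique chosen for illustration. The slides do not say which learning method Haystack uses, and all names and sample texts below are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Count lowercase whitespace-separated words."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_by_taste(candidates, liked):
    """Order candidate texts by similarity to everything the user has liked."""
    profile = Counter()
    for text in liked:
        profile.update(bag_of_words(text))
    return sorted(candidates,
                  key=lambda c: cosine(bag_of_words(c), profile),
                  reverse=True)

liked = ["bayesian networks for data mining", "probabilistic models of text"]
incoming = ["celebrity gossip roundup", "bayesian models for text mining"]
ranked = rank_by_taste(incoming, liked)
```

Learning over time then amounts to appending newly liked objects to `liked`, so later rankings reflect the growing profile.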
News Service
  Scavenges articles from your favorite news sources
    HTML parsing/extracting services
  Over time, learns types of articles that interest you
    Prioritizes those for display
  Content provider no longer controls viewing experience
    No more ads
Personalized News Service
Collaborative Access
  Want to leverage others’ work in organizing information
  No need to “publish” expertise
    Exposed automatically, without effort
    Self interest helps others
Volt
  Volt is about collaboration between people
  The Haystack architecture allows easy collaboration among individuals
    Semantic Web references to Haystack objects
    Individuals share parts of their Haystack
    Group spaces and shared notebooks
Collaborators
  Those I interact with
    Frequent mail contact
    Frequent visits to their home page
  Those with shared content
    And who have same opinions about content
    Collaborative filtering techniques
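Finding people with shared content and the same opinions about it resembles the user-to-user similarity step common in collaborative filtering. The sketch below scores agreement on commonly rated objects. The like/dislike ratings, the `min_shared` threshold, and the names are assumptions for illustration, not details from Volt.

```python
def agreement(mine, theirs):
    """Fraction of commonly rated objects on which two users agree."""
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0
    return sum(mine[o] == theirs[o] for o in shared) / len(shared)

def suggest_collaborators(me, others, min_shared=2):
    """List users with enough shared content, most like-minded first."""
    scored = []
    for name, ratings in others.items():
        if len(me.keys() & ratings.keys()) >= min_shared:
            scored.append((agreement(me, ratings), name))
    # Highest agreement first; ties broken alphabetically for stable output.
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]

# Hypothetical ratings: True means "liked", False means "disliked".
me = {"paper-a": True, "paper-b": False, "paper-c": True}
others = {
    "tommi": {"paper-a": True, "paper-b": False, "paper-c": True},
    "sam": {"paper-a": False, "paper-b": False, "paper-c": True},
    "pat": {"paper-z": True},
}
suggested = suggest_collaborators(me, others)
```

Requiring a minimum number of shared objects keeps a single coincidental match from producing a recommendation.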
Referrals
  Expertise search engine
Expertise Beacon
Volt Expertise Beacons
  Group spaces and shared notebooks
    Create individual and group profiles
  Profiles can be used to find other people
    Allows targeted search
    “Who else is working on this project?”
  User controls visibility/privacy
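A targeted search over profiles that respects each owner's visibility settings might look like the following. The `Profile` fields and the rule that a profile is either public or shared with named people are hypothetical; the slides do not describe how Volt represents profiles or privacy.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    owner: str
    topics: set
    public: bool = False                            # visible to everyone
    visible_to: set = field(default_factory=set)    # otherwise, only these people

def who_works_on(topic, profiles, asker):
    """Answer "who else is working on this?" using only profiles the asker may see."""
    return sorted(p.owner for p in profiles
                  if p.owner != asker
                  and topic in p.topics
                  and (p.public or asker in p.visible_to))

# Hypothetical profiles.
profiles = [
    Profile("lynn", {"haystack", "semantic web"}, public=True),
    Profile("mark", {"volt", "haystack"}, visible_to={"david"}),
    Profile("ralph", {"semantic web"}),             # fully private
    Profile("david", {"haystack"}, public=True),
]
seen_by_david = who_works_on("haystack", profiles, asker="david")
seen_by_guest = who_works_on("haystack", profiles, asker="guest")
```

The same query returns different answers for different askers, which is one way to read "user controls visibility/privacy".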
Summary
  Next generation information access
  Semantic Web provides a language and capabilities for meta-data
  Haystack teases out individual knowledge, stores it in a coherent fashion, and allows a variety of application clients to leverage individual meta-data
  Volt turns individual knowledge into a community resource
More Info
  http://haystack.lcs.mit.edu/
  http://www.w3c.org/2001/