Individualized Knowledge Access

David Karger, Lynn Andrea Stein, Mark Ackerman, Ralph Swick

Information Access

A key task in Oxygen: help people manage and retrieve information
Three overlapping projects:
  Haystack: information storage and retrieval application clients
  Semantic Web: next-generation metadata
  Volt: collaborative access

Presentation Overview

Motivation
  Information access behavior and goals
System Design & Architecture
  Data model
  Interacting data and UI components
Working applications
  Base Haystack
  FrontPage
  Volt

Motivation

Problem Scenario

I try solving problems using my data:
Information gathered personally
  High quality, easy for me to understand
  Not limited to publicly available content
My organization:
  Personal annotations and metadata
  Choose own subject arrangement
  Optimize for my kind of searching
Adapts to my needs

Then Turn to a Friend

Leverage
  They organize information for their own use
  Let them find things for me too
Shared vocabulary
  They know me and what I want
Personal expertise
  They know things not in any library
Trust
  Their recommendations are good

Last to Library/Web

Answer usually there
But hard to find
  Wish: rearrange to suit my needs
  Wish: help from my friends in looking

Lessons

Individualized access
  Best tools adapt to individual ways of organizing and seeking data
Individualized knowledge
  People know more than they publish
  That knowledge is useful to them and others
Collaborative use
  Right incentives lead to sharing and joint use

Haystack

Individualized access
  My data collection, organization
  Search tools tuned for me
Collaborate to leverage individual knowledge
  Access unpublished information in others’ haystacks
  Self-interest leads to public benefit
Lens to personalize access to the world library
  Rearrange presentation to suit my personal needs

Example

Info on probabilistic models in data mining
  My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola
  Tommi told his haystack that “Bayesian” refers to “probability models”
  Tommi has read several papers on Bayesian methods in data mining
  Some are by Daphne Koller
  I read/liked other work by Koller
  My haystack queries “Daphne Koller Bayes” on Yahoo
  Tommi’s haystack can rank the results for me…
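The first two steps of this example (finding the right friend, then borrowing that friend's vocabulary) can be sketched in a few lines. This is only an illustration under stated assumptions: `best_contact`, `expand_query`, and the shape of the mail and vocabulary data are hypothetical and are not taken from Haystack.

```python
from collections import Counter


def best_contact(term, mail):
    """Pick the correspondent whose mail mentions the term most often.

    `mail` is a list of (sender, body) pairs (a hypothetical format).
    """
    counts = Counter(sender for sender, body in mail if term in body.lower())
    return counts.most_common(1)[0][0] if counts else None


def expand_query(term, vocabulary):
    """Add words that a friend's haystack has linked to the given term.

    `vocabulary` maps a word to the phrase the friend said it refers to.
    """
    related = sorted(word for word, meaning in vocabulary.items() if term in meaning)
    return [term] + related


mail = [
    ("Tommi", "Notes on probability bounds"),
    ("Tommi", "A probability question"),
    ("Lynn", "Lunch?"),
]
tommi_vocabulary = {"Bayesian": "probability models"}

print(best_contact("probability", mail))             # Tommi
print(expand_query("probability", tommi_vocabulary))  # ['probability', 'Bayesian']
```

The expanded terms could then be sent to an external search engine, with the friend's haystack ranking what comes back.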

System Design

Gathering Data

Haystack archives anything
  Web pages browsed, email sent and received, address book, documents written
And any properties, relationships
  Text of object (for text search)
  Author, title, color, citations, quotations, annotations, quality, last usage
Users freely add types, relationships

Semantic Web

Arbitrary objects, connected by named links
No fixed schema
  User extensible
Sharable by any application
A new “file system”?

[Figure: graph of nodes (Doc, D. Karger, Haystack, Outstanding, HTML) connected by named links (title, author, quality, says, type)]
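The node-and-link diagram can be read as a set of (subject, link, object) triples. The sketch below is a minimal illustration, not Haystack's storage code: the `TripleStore` class, its method names, the `doc1` identifier, and the pairing of each node with a link (inferred from the diagram labels) are all assumptions.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical in-memory store of (subject, link, object) triples."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)

    def add(self, subject, link, obj):
        """Record one named link; no schema restricts which link names exist."""
        triple = (subject, link, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)

    def match(self, subject=None, link=None, obj=None):
        """Return matching triples in sorted order; None acts as a wildcard."""
        if subject is not None:
            candidates = self._by_subject.get(subject, set())
        else:
            candidates = self._triples
        return sorted(
            t for t in candidates
            if (link is None or t[1] == link) and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("doc1", "title", "Haystack")       # assumed pairing of diagram labels
store.add("doc1", "author", "D. Karger")
store.add("doc1", "type", "HTML")
store.add("doc1", "quality", "Outstanding")
store.add("doc1", "color", "blue")           # a user-added link name

print(store.match("doc1", "author"))  # [('doc1', 'author', 'D. Karger')]
```

Because links are just strings, adding a new relationship type needs no schema change, which is the "user extensible" property the slide describes.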

Gathering Data

Active user input
  Interfaces let user add data, note relationships
Mining data from prior data
  Plug-in services opportunistically extract data
Passive observation of user
  Plug-ins to other interfaces record user actions

[Figure: architecture diagram with components Other Users, Data Extraction Services, Web Observer Proxy, Triple Store, Mail Observer Proxy, Machine Learning Services, Web Viewer, Volt Viewer/Editor, Spider]

Sample Applications

Sample Applications

Because everything uses the Semantic Web constructions, a variety of application clients can share information
  Web Browser---data viewer
  FrontPage---personalized information filter
  Volt---collaboration tool

Haystack via Web

Web server interface
Basic operations:
  Insert objects
  View objects
  Queries

Haystack via Web

Haystack via Web

Viewer shows one node and associated arrows
Service notices we’ve archived a directory; so archives the objects it contains (and so on…)
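The directory behavior described here is a recursive walk. A minimal sketch, assuming triples are kept in a plain Python set and that the link names `type` and `contains` are acceptable stand-ins (both are hypothetical, as is the `archive` function itself):

```python
from pathlib import Path


def archive(path, triples):
    """Archive one object; if it is a directory, archive what it contains too."""
    path = Path(path)
    kind = "Directory" if path.is_dir() else "File"
    triples.add((str(path), "type", kind))
    if path.is_dir():
        for child in sorted(path.iterdir()):
            triples.add((str(path), "contains", str(child)))
            archive(child, triples)  # "and so on…"
    return triples
```

Calling `archive("some/directory", set())` returns one `type` triple per object reached plus one `contains` triple per parent-child pair.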

Haystack via Web

Services detect document type, extract relevant metadata
Output can specialize by type of object
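As a rough illustration of a type-detecting extraction service, the sketch below guesses a document's type from its name with Python's standard `mimetypes` module and, for HTML, pulls out the title with `html.parser`. The function name `extract_metadata` and the triple layout are assumptions; the services in Haystack itself may work quite differently.

```python
import mimetypes
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_metadata(name, content):
    """Return (object, link, value) triples describing one document."""
    mime, _ = mimetypes.guess_type(name)
    triples = [(name, "type", mime or "unknown")]
    if mime == "text/html":  # type-specific extraction
        parser = TitleExtractor()
        parser.feed(content)
        if parser.title.strip():
            triples.append((name, "title", parser.title.strip()))
    return triples


page = "<html><head><title>Haystack</title></head><body></body></html>"
print(extract_metadata("page.html", page))
```

A viewer could then branch on the `type` triple to choose how to present each object.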

Mediation

Haystack can be a lens for viewing data from the rest of the world
Stored content shows what user knows/likes
  Selectively spider “good” sites
Filter results coming back
  Compare to objects user has liked in the past
  Can learn over time
Example - personalized news service

News Service

News Service

Scavenges articles from your favorite news sources
  HTML parsing/extracting services
Over time, learns types of articles that interest you
  Prioritizes those for display
Content provider no longer controls viewing experience
  No more ads
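One simple way to "learn what interests you and prioritize it" is to compare each new article with the words of articles the user liked before. The sketch below uses bag-of-words cosine similarity purely as a stand-in technique; the slides do not say which learning method the news service uses, and `InterestProfile` is a hypothetical name.

```python
import math
import re
from collections import Counter


def words(text):
    """Lower-cased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InterestProfile:
    """Accumulates words from liked articles and ranks new ones against them."""

    def __init__(self):
        self.liked = Counter()

    def like(self, text):
        self.liked.update(words(text))  # the profile grows over time

    def rank(self, articles):
        return sorted(articles, key=lambda a: cosine(words(a), self.liked), reverse=True)


profile = InterestProfile()
profile.like("Bayesian models for data mining")
ranked = profile.rank(["Stock markets fall", "New Bayesian data mining results"])
print(ranked[0])  # New Bayesian data mining results
```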

Personalized News Service

Collaborative Access

Want to leverage others’ work in organizing information
No need to “publish” expertise
  Exposed automatically---without effort
  Self interest helps others

Volt

Volt is about collaboration between people
The Haystack architecture allows easy collaboration among individuals
  Semantic Web references to Haystack objects
  Individuals share parts of their Haystack
  Group spaces and shared notebooks

Collaborators

Those I interact with
  Frequent mail contact
  Frequent visits to their home page
Those with shared content
  And who have same opinions about content
  Collaborative filtering techniques
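The "shared content, same opinions" criterion can be sketched as a very simple form of collaborative filtering: score each other user by how often their opinion matches mine on the objects we both hold. The function names, the opinion labels, and the 0.5 threshold are illustrative assumptions, not details from Volt or Haystack.

```python
def agreement(mine, theirs):
    """Fraction of shared objects on which two users hold the same opinion."""
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0
    same = sum(1 for obj in shared if mine[obj] == theirs[obj])
    return same / len(shared)


def likely_collaborators(mine, others, threshold=0.5):
    """Names of users whose agreement score meets the threshold, best first."""
    scores = {name: agreement(mine, ratings) for name, ratings in others.items()}
    return sorted(
        (name for name, score in scores.items() if score >= threshold),
        key=lambda name: (-scores[name], name),
    )


my_opinions = {"paper1": "like", "paper2": "dislike", "paper3": "like"}
others = {
    "tommi": {"paper1": "like", "paper2": "dislike"},
    "stranger": {"paper1": "dislike", "paper9": "like"},
}
print(likely_collaborators(my_opinions, others))  # ['tommi']
```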

Referrals

Expertise search engine
Expertise Beacon

Volt Expertise Beacons

Group spaces and shared notebooks
Create individual and group profiles
Profiles can be used to find other people
  Allows targeted search
  “Who else is working on this project?”
User controls visibility/privacy
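A targeted search over profiles that respects each owner's visibility settings might look like the sketch below. The `Profile` fields and the `who_works_on` function are hypothetical; the slides do not describe how Volt represents profiles or privacy controls.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    owner: str
    topics: set = field(default_factory=set)
    visible_to: set = field(default_factory=set)  # empty set means fully private


def who_works_on(topic, profiles, asker):
    """Answer "who else is working on this?" using only profiles the asker may see."""
    return sorted(
        p.owner
        for p in profiles
        if topic in p.topics and asker in p.visible_to and p.owner != asker
    )


profiles = [
    Profile("david", {"haystack"}, {"lynn", "mark"}),
    Profile("ralph", {"haystack", "semantic web"}, {"lynn"}),
    Profile("mark", {"haystack"}, set()),  # private: never returned
]
print(who_works_on("haystack", profiles, asker="lynn"))  # ['david', 'ralph']
```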

Summary

Next generation information access
Semantic Web
  provides a language and capabilities for metadata
Haystack
  teases out individual knowledge, stores it in a coherent fashion, and allows a variety of application clients to leverage individual metadata
Volt
  turns individual knowledge into a community resource

More Info

http://haystack.lcs.mit.edu/
http://www.w3c.org/2001/[email protected]@[email protected]@w3.org
