SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

By Keith Cortis & Charlie Abela

This paper was presented at the 1st Workshop on Personal Semantic Data (PSD 2010: http://semanticweb.org/wiki/Personal_Semantic_Data), held at EKAW 2010 (http://ekaw2010.inesc-id.pt/), the Conference on Knowledge Engineering and Knowledge Management by the Masses, in Lisbon, Portugal, on 11 October 2010. The full paper is available at: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-629/psd2010_paper2.pdf

Transcript of SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Page 1: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

By Keith Cortis & Charlie Abela

Page 2: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Instant Messaging (IM) - real-time communication where messages are transferred in a seemingly peer-to-peer manner

Increase in the fragmentation of personal information

Several tools have been developed to aid users in the management of their personal information space

Page 3: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Vision behind the Semantic Desktop (SD) - tackling the difficulties of managing personal information

Research - directed towards this area and the extraction of semantics from chat conversations

Improve Personal Information Management (PIM) by linking the different content found on the desktop with the extracted semantics

Page 4: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Exploiting and extending NEPOMUK’s Social Semantic Desktop framework with a semantic chat client component, ‘SemChat’

Extraction and annotation of important concepts from a chat conversation

Storage of any concepts that were not annotated, for reference in future SemChat sessions

Page 5: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Semantic search for specific concepts (incl. events) in different ways, for example by date

Ability to use this plug-in from different chat clients, achievable by using a client that can handle multiple protocols

Page 6: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

General Architecture

Page 7: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

NEPOMUK - allows the user to manage all data found on her desktop and to link the documents within the PIMO

Spark IM - XMPP chat client that satisfied our needs

Spark IM - enhanced with multi-protocol functionality via the availability of an XMPP server

Page 8: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

End of chat session - non-intrusive system

Cost of interruptions - on average, users take between 10 and 15 minutes to return their focus to the disrupted task

Page 9: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Context menus represent the operations that a user can perform on each extracted concept

Page 10: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)
Page 11: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

JAPE rules implemented - to recognize possible events within a chat conversation using regular expressions over annotations

Page 12: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Rule: EventRule
(
  { Lookup.majorType == event_trigger }
):eventTrigger
-->
{
  // Annotations bound to the eventTrigger label on the left-hand side
  AnnotationSet matchedAnns = (AnnotationSet) bindings.get("eventTrigger");
  FeatureMap newFeatures = Factory.newFeatureMap();
  newFeatures.put("rule", "EventRule");
  // Add an EventTrigger annotation spanning the matched text
  outputAS.add(matchedAnns.firstNode(), matchedAnns.lastNode(),
               "EventTrigger", newFeatures);
}
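
The rule above fires on any Lookup annotation whose majorType is event_trigger; in GATE such Lookup annotations come from a gazetteer list. Outside GATE, the same idea can be roughly sketched in plain Java. This is an illustration only, not the SemChat implementation: the class name, the method name and the four trigger words are assumptions, and a regular expression over raw text stands in for the gazetteer lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java sketch of trigger matching; not the GATE/JAPE implementation.
final class EventTriggerSketch {

    // Hypothetical trigger words standing in for the event_trigger gazetteer list.
    private static final Pattern TRIGGER = Pattern.compile(
            "\\b(meeting|lunch|dinner|appointment)\\b", Pattern.CASE_INSENSITIVE);

    // Returns one "start-end:text" entry per trigger found,
    // mirroring the start and end offsets of an annotation span.
    static List<String> findTriggers(String message) {
        List<String> spans = new ArrayList<>();
        Matcher m = TRIGGER.matcher(message);
        while (m.find()) {
            spans.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(findTriggers("Shall we have a meeting after lunch?"));
    }
}
```

Each returned entry plays the role of the EventTrigger annotation that the JAPE rule adds to the output annotation set.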

Page 13: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Title and prospective date of the extracted event can be edited by the user

Annotated events are automatically saved within Spark’s Task List

Page 14: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

User can filter a search by several criteria, for example by date

Page 15: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

No formal evaluation was performed on any of the semantic chat client projects that we considered in the related work section

A session was organized where 8 users tried out SemChat

6-12 participants are enough to test the usability of a system (Dumas and Redish)

Page 16: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Feature for extracting concepts from chat conversations - proved a popular choice

Semantic search feature - proved less popular with several users

Majority of users experienced the extraction of concepts and/or events from their chat conversation

Page 17: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

All extracted concepts/events annotated by users were successfully stored in the PIMO and Task List respectively

In some cases important concepts flagged within a conversation were not extracted

Problem - XtraK4Me selects the most important key phrases ordered by occurrence rate
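
To illustrate why ordering by occurrence rate can miss an important concept, here is a toy ranking in Java. It is not XtraK4Me's algorithm: the class and method names are hypothetical, single words stand in for key phrases, and the sample chat text is made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy frequency ranking; an illustration only, not XtraK4Me's algorithm.
final class FrequencyRankSketch {

    // Returns the k most frequent lower-cased tokens; ties keep first-seen order.
    static List<String> topK(String text, int k) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so equal counts stay in insertion order.
        entries.sort((a, b) -> b.getValue() - a.getValue());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        String chat = "paper draft ready, paper review today, "
                + "send paper draft before Lisbon trip";
        System.out.println(topK(chat, 2));
    }
}
```

With a cutoff of two, only "paper" and "draft" survive. "Lisbon" is mentioned once, so a purely frequency-based cutoff drops it even though a user might consider it the important concept.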

Page 18: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Problem can be addressed by improving XtraK4Me or possibly by using a better key phrase extractor

Limitation - some events were not extracted since they did not conform to the structure that SemChat was implemented to recognize

Possible solution - further extend ANNIE NER to recognize all possible types of events that can be present within a chat conversation

Page 19: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Context-aware chat program

Tries to solve semantic conflicts that occur between chatting users through the tagging of ambiguous chat messages

Solves part of this problem and is a step towards eliminating semantic conflicts that occur in chat sessions

Page 20: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Morphological analysis used to extract proper nouns from the dialogue text

Online images and articles from Wikipedia related to the extracted nouns are simultaneously displayed alongside the dialogue text

Helps in reducing the elements of ambiguity like searching

Page 21: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Identifies and addresses problems that IM systems encounter in moving towards the Networked Semantic Desktop

Chat window offers a taxonomy panel where annotation of messages is permitted whilst a user is chatting

Semantic Querying - search for the desired messages by specifying a particular attribute

Page 22: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

System uses existing email transport technology

Is integrated with NEPOMUK

Handles and keeps track of action items within email messages

Extracts tasks and appointments found within email messages, which are then added to the email client’s scheduler

Page 23: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Prototype system

Automatically identifies action items (tasks) in email messages

Presents user with a task-focused summary of a message

User can add action items to their “to do” list

Page 24: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Integration of SemChat with popular applications, such as an email client like Thunderbird

Extracted events would be logged automatically into the client’s event scheduler

Extend ANNIE NER through JAPE so that other entities could be extracted from conversations, such as emails, products, etc.

Page 25: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Semantic search feature - further optimize the searching process

Semantic search feature - to be further enhanced to display the part of the chat transcript satisfying the search criteria

Semantic annotations generated by SemChat - to be quantitatively evaluated in the future

Page 26: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Investigate slang language in IM in more depth so that SemChat can be adapted to handle it

Ex.: “mt b4 lunch @11.30am nxt tue”

We can further extend ANNIE NER with JAPE to recognize such an event, reading ‘mt b4’ as ‘meet before’ and ‘nxt tue’ as ‘next Tuesday’
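
The slide proposes handling slang inside ANNIE via JAPE. As a simpler illustration of the mapping involved, the sketch below expands shorthand tokens with a lookup table before any entity recognition takes place. This pre-processing step is an assumption, not the approach described in the slides, and the class name and the four dictionary entries are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Dictionary-based expansion of IM shorthand; entries are illustrative assumptions.
final class SlangNormalizerSketch {

    private static final Map<String, String> SLANG = new LinkedHashMap<>();
    static {
        SLANG.put("mt", "meet");
        SLANG.put("b4", "before");
        SLANG.put("nxt", "next");
        SLANG.put("tue", "Tuesday");
    }

    // Replaces each whitespace-separated token that has a dictionary entry;
    // tokens without an entry (for example "@11.30am") pass through unchanged.
    static String normalize(String message) {
        StringBuilder out = new StringBuilder();
        for (String token : message.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(SLANG.getOrDefault(token.toLowerCase(), token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: meet before lunch @11.30am next Tuesday
        System.out.println(normalize("mt b4 lunch @11.30am nxt tue"));
    }
}
```

A fixed table like this only covers shorthand that has been listed in advance, which is one reason the slides call for investigating IM slang in more depth.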

Page 27: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

We have presented a semantic chat component, SemChat, which was integrated with an SSD application - NEPOMUK

SemChat contributes further to the area of PIM through the integration of concepts in the user’s PIMO and the integration of events within an events scheduler

SemChat also reflects the research being done in the area of the SD in relation to Semantic Chat

Page 28: SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

Thank you for your attention!

Any Questions?