Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute www.deri.ie. Extracting and Utilizing Social Networks from Log Files of Shared Workspaces. Peyman Nasirifard, Vassilios Peristeras, Conor Hayes and Stefan Decker. 10th IFIP Working Conference on VIRTUAL ENTERPRISES, Thessaloniki, Greece, 7-9 October 2009

Transcript of Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Page 1: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Copyright 2009 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Peyman Nasirifard, Vassilios Peristeras, Conor Hayes and Stefan Decker

10th IFIP Working Conference on VIRTUAL ENTERPRISES, Thessaloniki, Greece, 7-9 October 2009

Page 2: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Outline

Introduction and Problem Definition
Object-centric social network for extracting expertise
User-centric social network for calculating the cooperation index
Prototypes
– Expert Finder
– Holmes
Evaluation
Conclusion
Q and A

Page 3: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Introduction and Problem Definition

Online shared workspaces provide various services for online collaboration
– BSCW, SharePoint

Difficult to find people with appropriate expertise in intra- and inter-organizational settings
– People do not update their profiles regularly

Difficult to spot "who works with whom" or "who the senior within a community is"
– People do not maintain their social networks frequently

Page 4: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Problem Definition

To find people with specific expertise
To understand who works with whom and to what extent

Page 5: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Our approach

We use:
– Log files from CWEs
– Social Network Analysis
– Semantic technologies (RDF) to represent the extracted social network

Page 6: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Social Network Analysis

Social Network Analysis has a lot of potential

Overt and Latent social networks exist among professionals

Online social networks can be divided into two main types
– Object-centric (e.g., based on videos, music)
– User-centric

We use both types in our work
– We use object-centric SN for extracting expertise
– We use user-centric SN for calculating the cooperation index
– Cooperation index: an index that determines how closely two people work together

Page 7: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Log files

Log files of shared workspaces contain rich information and can be further analyzed

A log record contains at minimum
– Subject (e.g., user), Object (e.g., document) and Action/Verb (e.g., read, revise)
– Example: Person with ID 123 revised the document with ID 456

We use these three elements to generate RDF triples for processing
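As a rough illustration of the subject/action/object mapping, the sketch below serializes one log record as an N-Triples statement. The namespace `http://example.org/cwe/`, the `LogRecord` class and the `to_ntriple` function are assumptions made for this example; the slides do not say which vocabulary or tooling the prototypes actually use.

```python
from dataclasses import dataclass

# Hypothetical namespace: the slides do not name the vocabulary actually used.
BASE = "http://example.org/cwe/"


@dataclass(frozen=True)
class LogRecord:
    subject_id: str  # who acted, e.g. a user ID
    action: str      # what was done, e.g. "read" or "revise"
    object_id: str   # what was acted on, e.g. a document ID


def to_ntriple(record: LogRecord) -> str:
    """Serialize one log record as a single N-Triples statement."""
    subject = f"<{BASE}person/{record.subject_id}>"
    predicate = f"<{BASE}action/{record.action}>"
    obj = f"<{BASE}document/{record.object_id}>"
    return f"{subject} {predicate} {obj} ."


# "Person with ID 123 revised the document with ID 456"
record = LogRecord(subject_id="123", action="revise", object_id="456")
triple = to_ntriple(record)
print(triple)
```

Keeping the action as the predicate means each log line becomes exactly one triple, so the object-centric network is simply the set of all such statements.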

Page 8: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Object-centric Social Networks for extracting expertise

Page 9: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Finding Experts

First step: Key-phrase Extraction
– Documents are analysed based on NLP techniques to identify phrases that occur frequently

Second step: Log File Analysis
– To identify the documents a user interacts with and how

Third step: Assigning Expertise
– A user is expert in topic X, if s/he created or revised a document that contains topic X.
– A user is familiar with topic Y, if s/he just read a document that contains topic Y.
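The third step can be sketched as follows, assuming the key phrases per document (first step) and the log events (second step) are already available. The sample data, the `assign_expertise` function and the level names "expert" and "familiar" are illustrative assumptions, not the Expert Finder implementation.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the outputs of steps one and two.
doc_phrases = {
    "456": {"social network analysis", "rdf"},
    "789": {"web services"},
}
events = [
    ("123", "revise", "456"),  # (user, action, document)
    ("123", "read", "789"),
    ("124", "read", "456"),
]

EXPERT_ACTIONS = {"create", "revise"}


def assign_expertise(events, doc_phrases):
    """Return {user: {topic: level}} with level 'expert' or 'familiar'."""
    profiles = defaultdict(dict)
    for user, action, doc in events:
        if action in EXPERT_ACTIONS:
            level = "expert"
        elif action == "read":
            level = "familiar"
        else:
            continue  # other actions carry no expertise signal in this sketch
        for topic in doc_phrases.get(doc, ()):
            # An 'expert' entry is never downgraded by a later read.
            if profiles[user].get(topic) != "expert":
                profiles[user][topic] = level
    return {user: dict(topics) for user, topics in profiles.items()}


profiles = assign_expertise(events, doc_phrases)
```

Here user 123 ends up as expert in the topics of document 456 and familiar with the topic of document 789, which they only read.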

Page 10: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Overall Approach

Page 11: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

User-centric SN for calculating cooperation index

Page 12: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

From Object-centric to User-centric

[Figure: diagram with the labels "Action" and "Relationship"]

Page 13: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Assigning weights to social networks

First step: Build user-centric social network
– See previous slide
– Depth is also considered (e.g., depth one means just one document connects two persons)

Second step: Assign weights to relationships
– User-defined weights with default values (e.g., Read-Read is a low-weighted relationship, Create-Create a high-weighted one)

Third step: Calculate cooperation index
– Sum up the weights
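The three steps above can be sketched as below for depth one, where two users are related whenever they acted on the same document. The numeric weights, the `relationship` helper and the `cooperation_indices` function are assumptions for illustration; the slides only state that Read-Read is low-weighted and Create-Create is high-weighted, and that the weights are summed.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical default weights, keyed by alphabetically ordered action pairs.
DEFAULT_WEIGHTS = {
    ("create", "create"): 5.0,
    ("create", "revise"): 4.0,
    ("revise", "revise"): 3.0,
    ("create", "read"): 2.0,
    ("read", "revise"): 2.0,
    ("read", "read"): 1.0,
}


def relationship(action_a, action_b):
    """Name a relationship by its alphabetically ordered pair of actions."""
    return tuple(sorted((action_a, action_b)))


def cooperation_indices(events, weights=DEFAULT_WEIGHTS):
    """Sum relationship weights for each pair of users sharing a document."""
    by_document = defaultdict(set)
    for user, action, doc in events:
        by_document[doc].add((user, action))

    index = defaultdict(float)
    for doc_events in by_document.values():
        for (user_a, act_a), (user_b, act_b) in combinations(sorted(doc_events), 2):
            if user_a == user_b:
                continue  # a user's own actions do not form a relationship
            pair = tuple(sorted((user_a, user_b)))
            index[pair] += weights.get(relationship(act_a, act_b), 0.0)
    return dict(index)


events = [
    ("alice", "create", "doc1"),
    ("bob", "read", "doc1"),
    ("alice", "revise", "doc2"),
    ("bob", "revise", "doc2"),
]
indices = cooperation_indices(events)
```

With these assumed weights, alice and bob share a Create-Read relationship on doc1 (2.0) and a Revise-Revise relationship on doc2 (3.0), giving a cooperation index of 5.0. Passing a different `weights` mapping models the user-defined weights mentioned on the slide.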

Page 14: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Overall Approach

Page 15: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Prototypes

Expert Finder http://purl.oclc.org/projects/expertui

Holmes (Cooperation Index calculator) http://purl.oclc.org/projects/holmes

The prototypes are SOA-based
The prototypes use the BSCW shared workspace
The prototypes use log files of BSCW, in particular those of the Ecospace project over a period of three years
– Around 183 users extracted from the log file and some thousands of events

Expert Finder uses around 50 deliverables of the Ecospace project

Page 16: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Snapshot: Expert Finder

Page 17: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Snapshot: Holmes

Page 18: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Evaluation with 12 participants

We asked people to take a look at their cooperation indices
– All participants confirmed that the presented results were relevant to them

Currently, we consider four main document events (i.e., Create, Revise, Delete, and Read) and only relationships at a depth of one. These can easily be extended to cover more document events as well as deeper depths.
– Combining events and assigning weights to them can bring overhead for users.

In a more complex model for calculating Cooperation Indices, different weights can be assigned to documents based on their importance for the collaboration process.

Page 19: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Evaluation with 12 participants

Issue: Meaningless expertise
Solution: The confidence values (provided by the NLP package) were used as a threshold to identify the phrases that have a higher probability of being a meaningful key-phrase. The key phrases were filtered accordingly.

Issue: Organization expertise profile
Solution: An expertise profile may be built for an organization by unifying the expertise of all members of that organization.

Issue: Similar phrases
Solution: Some phrases were conceptually the same, but reported several times. One partial solution to this problem could be using WordNet to infer the semantics of the terms and merge relevant terms.

Issue: Irrelevant expertise
Solution: Version history of the shared workspace may be utilized to infer the exact contribution of a user (e.g., by using diff).
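The diff idea for the "irrelevant expertise" issue might look roughly like the sketch below, which uses Python's standard `difflib` module to keep only the key phrases found in lines a user added in a revision. The functions `added_lines` and `contributed_phrases` and the sample texts are hypothetical; the slides propose the idea but do not describe an implementation.

```python
import difflib


def added_lines(old_text, new_text):
    """Return the lines that ndiff marks as added in new_text."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


def contributed_phrases(old_text, new_text, key_phrases):
    """Keep only the key phrases that occur in text added by the revision."""
    added = " ".join(added_lines(old_text, new_text)).lower()
    return {phrase for phrase in key_phrases if phrase.lower() in added}


# Hypothetical document versions before and after one user's revision.
old_version = (
    "Shared workspaces support collaboration.\n"
    "Log files record user actions."
)
new_version = (
    "Shared workspaces support collaboration.\n"
    "Log files record user actions.\n"
    "RDF triples represent the extracted social network."
)
phrases = {"log files", "rdf triples", "social network"}
result = contributed_phrases(old_version, new_version, phrases)
```

In this example the reviser is credited with "rdf triples" and "social network" but not with "log files", which was already in the document. Attribution is at line granularity: a line that was only slightly edited counts as added in full, so a finer-grained diff would be needed for precise credit.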

Page 20: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Tools and technology overview

Social Network Analysis
Log files from CWEs
NLP techniques for Phrase Extraction
RDF for representing object-centric and user-centric Social Networks
Web Services for exposing functionalities

Page 21: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Conclusion and Future Work

We presented our approach for extracting expertise from online shared workspaces

We also presented our approach for calculating an index that determines how closely two people worked together in the past

Addressing the points (and shortcomings) mentioned in the evaluation is one of our future directions

Using temporal aspects of log files is another future direction
– Calculating the cooperation index over a period of time
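One simple way to approach the temporal direction is to restrict the log events to a time window before any index is calculated. The sketch below assumes each event carries a timestamp as its first field; the tuple layout, the `events_in_period` function and the sample dates are illustrative assumptions only.

```python
from datetime import datetime


def events_in_period(events, start, end):
    """Keep events whose timestamp lies in the half-open interval [start, end)."""
    return [event for event in events if start <= event[0] < end]


# Hypothetical events: (timestamp, user, action, document)
events = [
    (datetime(2007, 3, 1), "alice", "create", "doc1"),
    (datetime(2008, 6, 15), "bob", "read", "doc1"),
    (datetime(2009, 1, 10), "alice", "revise", "doc2"),
]
in_2008 = events_in_period(events, datetime(2008, 1, 1), datetime(2009, 1, 1))
```

The filtered list can then be passed to whatever cooperation index calculation is in use, so the index reflects only collaboration within the chosen period.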

Page 22: Extracting and Utilizing Social Networks from Log Files of Shared Workspaces

Thank You!

Q and A