Extracting and Utilizing Social Networks from Log Files of Shared Workspaces
Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Peyman Nasirifard, Vassilios Peristeras, Conor Hayes and Stefan Decker
10th IFIP Working Conference on VIRTUAL ENTERPRISES, Thessaloniki, Greece, 7-9 October 2009
Outline
Introduction and Problem Definition
Object-centric social network for extracting expertise
User-centric social network for calculating the cooperation index
Prototypes
– Expert Finder
– Holmes
Evaluation
Conclusion
Q and A
Introduction and Problem Definition
Online shared workspaces provide various services for online collaboration
– e.g., BSCW, SharePoint
It is difficult to find people with appropriate expertise in intra- and inter-organizational settings
– People do not update their profiles regularly
It is difficult to spot "who works with whom" or "who the senior within a community is"
– People do not maintain their social networks frequently
Problem Definition
To find people with specific expertise
To understand who works with whom and to what extent
Our approach
We use:
– Log files from collaborative working environments (CWEs)
– Social Network Analysis
– Semantic technologies (RDF) to represent the extracted social network
Social Network Analysis
Social Network Analysis has a lot of potential
Overt and Latent social networks exist among professionals
Online social networks can be divided into two main types
– Object-centric (e.g., based on videos, music)
– User-centric
We use both types in our work
– We use an object-centric SN for extracting expertise
– We use a user-centric SN for calculating the cooperation index
– Cooperation index: an index that determines how closely two people work together
Log files
Log files of shared workspaces contain rich information and can be further analyzed
A log record contains at minimum a Subject (e.g., user), an Object (e.g., document) and an Action/Verb (e.g., read, revise)
– Example: Person with ID 123 revised the document with ID 456
We use these three elements to generate RDF triples for processing
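The mapping from a log record to an RDF triple can be sketched as follows. This is a minimal illustration, not the authors' implementation: the namespace http://example.org/cwe/ and the names LogRecord and to_ntriple are assumptions made for the example, since the slides do not name the vocabulary that was used.

```python
from typing import NamedTuple

# Hypothetical namespace; the vocabulary actually used is not given in the slides.
BASE = "http://example.org/cwe/"


class LogRecord(NamedTuple):
    subject_id: str  # who acted, e.g. user "123"
    action: str      # what was done, e.g. "revise"
    object_id: str   # what was acted on, e.g. document "456"


def to_ntriple(record: LogRecord) -> str:
    """Serialize one log record as a single N-Triples line."""
    subject = f"<{BASE}user/{record.subject_id}>"
    predicate = f"<{BASE}action/{record.action}>"
    obj = f"<{BASE}document/{record.object_id}>"
    return f"{subject} {predicate} {obj} ."


# "Person with ID 123 revised the document with ID 456"
record = LogRecord(subject_id="123", action="revise", object_id="456")
print(to_ntriple(record))
```

Each log line becomes one subject-predicate-object statement, so the whole log can be loaded into any RDF store and queried by user, by action, or by document.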
Object-centric Social Networks for extracting expertise
Finding Experts
First step: Key-phrase Extraction
– Documents are analysed using NLP techniques to identify phrases that occur frequently
Second step: Log File Analysis
– To identify which documents a user interacts with, and how
Third step: Assigning Expertise
– A user is an expert in topic X if s/he created or revised a document that contains topic X.
– A user is familiar with topic Y if s/he only read a document that contains topic Y.
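The second and third steps can be sketched as below, assuming the key phrases of each document have already been extracted in the first step. The function name, the tuple layout of the log, and the sample data are hypothetical. Dropping a topic from "familiar" once the user is an expert in it is one reading of "just read" and is an assumption.

```python
from collections import defaultdict

# Actions that signal expertise versus familiarity, following the rule above.
EXPERT_ACTIONS = {"create", "revise"}
FAMILIAR_ACTIONS = {"read"}


def assign_expertise(log, key_phrases):
    """Map each user to the topics they are expert in or familiar with.

    log: iterable of (user, action, document) tuples from the workspace log.
    key_phrases: dict mapping document -> set of extracted key phrases.
    """
    profiles = defaultdict(lambda: {"expert": set(), "familiar": set()})
    for user, action, document in log:
        topics = key_phrases.get(document, set())
        if action in EXPERT_ACTIONS:
            profiles[user]["expert"] |= topics
        elif action in FAMILIAR_ACTIONS:
            profiles[user]["familiar"] |= topics
    # Assumption: a topic the user is expert in is not also listed as familiar.
    for profile in profiles.values():
        profile["familiar"] -= profile["expert"]
    return dict(profiles)


# Hypothetical sample data.
sample_log = [
    ("alice", "create", "d1"),
    ("bob", "read", "d1"),
    ("bob", "revise", "d2"),
    ("alice", "read", "d2"),
]
sample_phrases = {"d1": {"social network"}, "d2": {"rdf", "social network"}}
print(assign_expertise(sample_log, sample_phrases))
```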
Overall Approach
User-centric SN for calculating cooperation index
From Object-centric to User-centric
[Figure: actions on shared objects (object-centric) are mapped to relationships between users (user-centric)]
Assigning weights to social networks
First step: Build the user-centric social network
– See the previous slide
– Depth is also considered (e.g., depth one means that just one document connects two persons)
Second step: Assign weights to relationships
– User-defined weights with default values (e.g., Read-Read is a low-weighted relationship, Create-Create a high-weighted one)
Third step: Calculate the cooperation index
– Sum up the weights
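The three steps can be sketched for depth-one relationships as follows. The numeric weights are placeholders chosen for illustration: the slides only state that Read-Read is low-weighted and Create-Create is high-weighted. Counting a repeated event by the same user on the same document only once is also an assumption of this sketch.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical default weights per pair of actions (placeholder values).
DEFAULT_WEIGHTS = {
    frozenset(["create"]): 10,           # Create-Create
    frozenset(["create", "revise"]): 8,
    frozenset(["revise"]): 6,            # Revise-Revise
    frozenset(["create", "read"]): 4,
    frozenset(["revise", "read"]): 3,
    frozenset(["read"]): 1,              # Read-Read
}


def cooperation_index(log, weights=DEFAULT_WEIGHTS):
    """Return {frozenset({user_a, user_b}): index} for depth-one relationships.

    log: iterable of (user, action, document) tuples.
    """
    # Step 1: group events by document; two users are related at depth one
    # when a single document connects them.
    events_by_document = defaultdict(set)
    for user, action, document in log:
        events_by_document[document].add((user, action))

    # Steps 2 and 3: weight every pair of events by different users on the
    # same document, then sum the weights per pair of users.
    index = defaultdict(int)
    for events in events_by_document.values():
        for (user_a, action_a), (user_b, action_b) in combinations(sorted(events), 2):
            if user_a == user_b:
                continue
            pair = frozenset([user_a, user_b])
            index[pair] += weights.get(frozenset([action_a, action_b]), 0)
    return dict(index)


# Hypothetical sample data.
sample_log = [
    ("alice", "create", "d1"),
    ("bob", "revise", "d1"),
    ("alice", "read", "d2"),
    ("bob", "read", "d2"),
    ("carol", "read", "d2"),
]
print(cooperation_index(sample_log))
```

Keying the weights by an unordered pair of actions keeps the relationship symmetric, so Create-Read and Read-Create are treated as the same relationship.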
Overall Approach
Prototypes
Expert Finder
– http://purl.oclc.org/projects/expertui
Holmes (Cooperation Index calculator)
– http://purl.oclc.org/projects/holmes
The prototypes are SOA-based
The prototypes use the BSCW shared workspace
The prototypes use log files of BSCW, in particular those of the Ecospace project over a period of three years
– Around 183 users and some thousands of events were extracted from the log file
Expert Finder uses around 50 deliverables of the Ecospace project
Snapshot: Expert Finder
Snapshot: Holmes
Evaluation with 12 participants
We asked participants to take a look at their cooperation indices
– All participants confirmed that the presented results were relevant to them
Currently, we consider four main document events (i.e., Create, Revise, Delete, and Read) and only relationships at a depth of one. These can be extended to cover more document events as well as greater depths.
– Combining events and assigning weights to them can create overhead for users.
In a more complex model for calculating Cooperation Indices, different weights can be assigned to documents based on their importance for the collaboration process.
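One possible form of such a model is sketched below. Multiplying each relationship weight by a per-document importance factor is an assumption made for illustration, as are the function name, the document names, and the values. The slides do not specify how importance would enter the calculation.

```python
def weighted_cooperation_index(relationships, importance, default_importance=1):
    """Sum relationship weights scaled by the importance of the connecting document.

    relationships: iterable of (document, weight) pairs linking two users.
    importance: dict mapping document -> importance factor.
    """
    return sum(
        weight * importance.get(document, default_importance)
        for document, weight in relationships
    )


# Hypothetical data: two users share a project deliverable and some meeting notes.
relationships = [("deliverable-1", 8), ("meeting-notes", 1)]
importance = {"deliverable-1": 3}  # the deliverable counts three times as much
print(weighted_cooperation_index(relationships, importance))
```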
Evaluation with 12 participants
Issue: Meaningless expertise
Solution: The confidence values (provided by the NLP package) were used as a threshold to identify the phrases that have a higher probability of being a meaningful key phrase. The key phrases were filtered accordingly.

Issue: Organization expertise profile
Solution: An expertise profile may be built for an organization by unifying the expertise of all members of that organization.

Issue: Similar phrases
Solution: Some phrases were conceptually the same, but were reported several times. One partial solution to this problem could be using WordNet to infer the semantics of the terms and merge relevant terms.

Issue: Irrelevant expertise
Solution: The version history of the shared workspace may be utilized to infer the exact contribution of a user (e.g., by using diff).
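Three of these solutions are simple enough to sketch. The threshold value, the function names, and the sample inputs below are assumptions. The diff step uses Python's standard difflib as a stand-in for whatever diff tool would be applied to the version history.

```python
import difflib


def filter_key_phrases(scored_phrases, threshold=0.5):
    """Keep phrases whose confidence is at or above the threshold."""
    return {phrase for phrase, confidence in scored_phrases if confidence >= threshold}


def organization_profile(member_profiles):
    """Build an organization profile as the union of its members' expertise."""
    profile = set()
    for topics in member_profiles.values():
        profile |= topics
    return profile


def added_lines(old_version, new_version):
    """Return the lines that appear in new_version but not in old_version."""
    diff = difflib.ndiff(old_version.splitlines(), new_version.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical inputs.
print(filter_key_phrases([("social network", 0.9), ("the following", 0.2)]))
print(organization_profile({"alice": {"rdf"}, "bob": {"rdf", "log analysis"}}))
print(added_lines("Introduction\nRelated work",
                  "Introduction\nRDF model of the log\nRelated work"))
```

Restricting key-phrase extraction to the lines a user actually added would tie expertise to that user's own contribution rather than to the whole document.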
Tools and technology overview
Social Network Analysis
Log files from CWEs
NLP techniques for phrase extraction
RDF for representing object-centric and user-centric social networks
Web Services for exposing functionalities
Conclusion and Future Work
We presented our approach for extracting expertise from online shared workspaces
We also presented our approach for calculating an index that determines how close two people worked together in the past
Addressing the points (and shortcomings) mentioned in the evaluation is one of our future directions
Using temporal aspects of the log files is another future direction
– Calculating the cooperation index over a given period of time
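A time-restricted index could start from a filter like the one below, which keeps only the events inside a given period before any index is calculated. The tuple layout (date first) and the function name are assumptions; the slides do not describe the timestamp format of the log.

```python
from datetime import date


def events_in_period(log, start, end):
    """Keep log events whose date falls within [start, end], inclusive.

    log: iterable of (date, user, action, document) tuples.
    """
    return [event for event in log if start <= event[0] <= end]


# Hypothetical sample data spanning three years.
sample_log = [
    (date(2007, 3, 1), "alice", "create", "d1"),
    (date(2008, 6, 15), "bob", "revise", "d1"),
    (date(2009, 1, 10), "carol", "read", "d1"),
]
print(events_in_period(sample_log, date(2008, 1, 1), date(2008, 12, 31)))
```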
Thank You!
Q and A