Mining Citizen Sensor Communities to Improve Cooperation with Organizational Actors
June 23 2015 PhD Defense
Hemant Purohit (Advisor: Prof. Amit Sheth)
Kno.e.sis, Dept. of CSE, Wright State University, USA
@hemant_pt
Outline
• Citizen Sensor Communities & Organizations
• Cooperative System Design Challenges
• Contributions
  - Problem 1. Conversation Classification using Offline Theories
  - Problem 2. Intent Classification
  - Problem 3. Engagement Modeling
• Applications
• Limitations & Future Work
Citizen Sensors: Access to Human Observations & Interactions
• Uni-directional communication (TO people)
• Bi-directional (BY people, TO people)
• Web 2.0 media
Unstructured, Unconstrained Language Data
• Ambiguity
• Sparsity
• Diversity
• Scalability
Goal: Data to Decision Making
Noisy Citizen Sensor data → Organizational Decision Making
Scope of My Research
SOCIAL SCIENCE
• Experts on Organizations
• Small-scale Data
COMPUTER SCIENCE
• Experts on Mining
• Large-scale data
CITIZEN SENSOR COMMUNITIES
1. No Structured Roles
2. No Defined Tasks
✓ But "GENERATE" Massive Data
ORGANIZATIONS
1. Structured Roles
2. Defined Tasks
✓ COLLECT Data
✓ Process, & Make Decisions
COOPERATIVE SYSTEM
"Can you help us?"
"Sure! How to help?"
Computer-Supported Cooperative Work (CSCW) Matrix
[Johansen 1988, Baecker 1995]
[Figure: CSCW matrix with axes TIME and PLACE]
Challenges (Malone & Crowston 1990; Schmidt & Bannon 1992)
COOPERATIVE SYSTEM between ORGANIZATIONS and CITIZEN SENSOR COMMUNITIES
DESIGN PROBLEM → DATA PROBLEM
• Awareness → ENGAGEMENT MODELING
  Org. Actor: Q1. Who to engage first?
• Articulation → INTENT MINING
  Org. Actor: Q2. What are resource needs & availabilities?
Research Questions
• Can general theories of offline conversation be applied in the online context?
• Can we model intentions to inform organizational tasks using knowledge-guided features?
• Can we find reliable groups to engage by modeling collective group divergence using a content-based measure?
Thesis: Statement
Prior knowledge, and interplay of features of users, their content, and network efficiently model Intent & Engagement for cooperation of citizen sensor communities.
Scope of Concepts
• Intent: aim of action, e.g., offering help
• Engagement: involvement in activity, e.g., participating in discussion
Contributions
1. Operationalized computing in cooperative system design
   - by accommodating articulation in Intent Mining, and
   - enriching awareness by Engagement Modeling
2. Improved computation of online social data
   - by incorporating features from offline social theoretical knowledge
3. Improved performance of intent classification
   - by fusing top-down & bottom-up data representations
4. Improved explanation of group engagement
   - by modeling content divergence to complement existing structural measures
Data: Scope
• Social Platform: Twitter
  - Important bridge between citizens & organizations
• Characteristics
  - Users: follow/subscribe
  - Content: status updates (140 chars max)
  - Network: directed
• Platform conversation functions
  - Reply
  - Retweet
  - Mention
Outline
• Citizen Sensor Communities & Organizations
• Cooperative System Design Challenges
  - Awareness: tackle via Engagement Modeling
  - Articulation: tackle via Intent Mining
• Contributions
  - Problem 1. Conversation Classification using Offline Theories
  - Problem 2. Intent Classification
  - Problem 3. Engagement Modeling
• Applications
• Limitations & Future Work
Problem 1. Conversation Classification
R1. Can general theories of conversation be applied in the online context?
• The Reply, Retweet, and Mention functions reflect conversation
Example:
User1. Analyzing #Conversations on Twitter. Using platform provided functions #REPLY, #RT, and #Mention. …
User2. I kinda feel one might need more than just the platform fn -- @User1 u can think #Psycholinguistics, dude!
• Task: Given a set S of messages mi, classify a sample {mi} for {RP, None}, {RT, None}, {MN, None}, where
  - Ground-truth corpora:
    RP = { mi | has_Reply_function(mi) = True }
    RT = { mi | has_Retweet_function(mi) = True }
    MN = { mi | has_Mention_function(mi) = True }
    None = S − (RP ∪ RT ∪ MN)
  - Sample {mi} size = 3, based on average Reply conversation size
Conversation Classification: Offline Theories
• Psycholinguistics Indicators [Clark & Gibbs, 1986, Chafe 1987, etc.]
  - Determiners ('the' vs. 'a/an')
  - Dialogue Management (e.g., 'thanks', 'anyway'), etc.
• Drawback
  - Offline analysis focused on positive conversation instances
• Hypotheses
  - Offline theoretic features are discriminative
  - Such features correlate with information density
Conversation Classification: Feature Examples
CATEGORY Hj: Hj SET
• H1 - Determiners: (the)
• H3 - Subject pronouns: (she, he, we, they)
• H9 - Dialogue management indicators: (thanks, yes, ok, sorry, hi, hello, bye, anyway, how about, so, what do you mean, please, {could, would, should, can, will} followed by pronoun)
• H11 - Hedge words: (kinda, sorta)
Feature definition:
• Feature_Hj(mi) = term-frequency(Hj-set, mi)
• Normalized
• Total 14 feature categories
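The feature definition above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the indicator sets are small subsets of the examples on the slide, and normalizing by the message's token count is an assumption, since the slide only says "Normalized".

```python
import re

# Illustrative subsets of the Hj sets named on the slide (not the full 14 categories).
H_SETS = {
    "H1_determiners": ["the"],
    "H3_subject_pronouns": ["she", "he", "we", "they"],
    "H9_dialogue_management": ["thanks", "yes", "ok", "sorry", "hi", "hello",
                               "bye", "anyway", "how about", "please"],
    "H11_hedge_words": ["kinda", "sorta"],
}


def tokenize(message):
    """Lowercase the message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def feature_hj(h_set, message):
    """Term frequency of the h_set indicators in the message.

    Assumption: the count is normalized by the number of tokens.
    """
    tokens = tokenize(message)
    if not tokens:
        return 0.0
    text = " ".join(tokens)
    count = sum(len(re.findall(r"\b" + re.escape(term) + r"\b", text))
                for term in h_set)
    return count / len(tokens)


def feature_vector(message):
    """One normalized feature per indicator category."""
    return {name: feature_hj(terms, message) for name, terms in H_SETS.items()}
```

For instance, `feature_vector("I kinda feel the need, thanks")` gives a non-zero value for the determiner, dialogue management, and hedge word categories, and zero for subject pronouns.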
Conversation Classification: Results
• Dataset
  - Tweets from 3 Disasters, and 3 Non-Disaster events
  - Varying set size (3.8K – 609K), time periods
• Classifier: Decision Tree
  - Evaluation: 10-fold Cross Validation
  - Accuracy: 62% - 78% [Lowest for {Mention, None}]
  - AUC range: 0.63 - 0.84
Purohit, Hampton, Shalin, Sheth & Flach. In Journal of Computers in Human Behavior, 2013
Conversation Classification: Discriminative Features
• Consistent top features across classifiers
  - Pronouns (e.g., you, he)
  - Dialogue management (e.g., thanks)
  - Determiners (e.g., the)
  - Word counts
• Positively correlated with RP, RT, MN
  - Correlation Coefficient up to 0.69
Conversation Classification: Psycholinguistic Analysis
• LIWC: Tool for deeper content analysis [Pennebaker, 2001]
• Gives a measure per psychological category
• Categories of interest
  - Social Interaction
  - Sensed Experience
  - Communication
• Analyzed output sets in confusion matrices
  → Higher values for positively classified conversation
  → Suggests higher information for cooperative intent
[Figure: LIWC measures for the True Positive, False Negative, False Positive, and True Negative sets]
Conversation Classification: Lessons
1. Offline theoretic features of conversations exist in the online environment
   → Can be applied for computing social data
2. Such features correlate with information density in content
   - Reflection of conversation for an intent
Short-text Document Intent
• Intent: Aim of action
DOCUMENT → INTENT
• "Text REDCROSS to 90999 to donate 10$ to help the victims of hurricane sandy" → SEEKING HELP
• "Anyone know where the nearest #RedCross is? I wanna give blood today to help the victims of hurricane Sandy" → OFFERING HELP
• "Would like to urge all citizens to make the proper preparations for Hurricane #Sandy - prep is key - http://t.co/LyCSprbk has valuable info!" → ADVISING
How to identify relevant intent from ambiguous, unconstrained natural language text?
Relevant intent → Articulation of organizational tasks (e.g., Seeking vs. Offering resources)
Intent Classification: Problem Formulation
• Given a set of user-generated text documents, identify existing intents
• Variety of interpretations
• Problem statement: a multi-class classification task
  approximate f: S → C, where
  C = {c1, c2 … cK} is a set of predefined K intent classes, and
  S = {m1, m2 … mN} is a set of N short text documents
Focus - Cooperation-assistive intent classes, C = {Seeking, Offering, None}
Intent Classification: Related Work
TEXT CLASSIFICATION TYPE: FOCUS (EXAMPLE)
• Topic: predominant subject matter (sports or entertainment)
• Sentiment/Emotion/Opinion: focus on present state of emotional affairs (negative or positive; happy emotion)
• Intent: focus on action, hence, future state of affairs (offer to help after floods)
e.g., I am going to watch the awesome Fast and Furious movie!! #Excited
Intent Classification: Related Work
DATA TYPE / APPROACH FOCUS / LIMITED APPLICABILITY
• Formal text on Webpages/blogs (Kröll and Strohmaier 2009, -15; Raslan et al. 2013, -14)
  - Approach focus: Knowledge Acquisition via Rules, Clustering
  - Limited applicability: lack of large corpora with proper grammatical structure; poor-quality text is hard to parse for dependencies
• Commercial Reviews, marketplace (Hollerit et al. 2013, Wu et al. 2011, Ramanand et al. 2010, Carlos & Yalamanchi 2012, Nagarajan et al. 2009)
  - Approach focus: Classification via Rules, Lexical templates, Patterns
  - Limited applicability: more generalized intents here (e.g., 'help' is broader than 'sell'); patterns are more implicit and harder to capture than for buying/selling
• Search Queries (Broder 2002, Downey et al. 2008, Case 2012, Wu et al. 2010, Strohmaier & Kröll 2012)
  - Approach focus: User Profiling, Query Classification
  - Limited applicability: lack of large query logs and click graphs; existence of social conversation
Intent Classification: Challenges
• Unconstrained Natural Language in small space
• Ambiguity in interpretation
• Sparsity: low signal-to-noise ratio, imbalanced classes
  - 1% signals (Seeking/Offering) in 4.9 million tweets for #Sandy
• Hard-to-predict problem:
  - commercial intent, F-1 score 65% on Twitter [Hollerit et al. 2013]
Example (color-coded on the slide: blue = offering intent, red = seeking intent):
@Zuora wants to help @Network4Good with Hurricane Relief. Text SANDY to 80888 & donate $10 to @redcross @AmeriCares & @SalvationArmyUS #help
Intent Classification: Types & Features
Intent
• Binary
  - Crisis Domain: [Varga et al. 2013] Problem vs. Aid (Japanese). Features: Syntactic, Noun-Verb templates, etc.
  - Commercial Domain: [Hollerit et al. 2013] Buy vs. Sell intent. Features: N-grams, Part-of-Speech
• Multiclass
  - Commercial Domain: Not on Twitter
Our Hybrid Approach
• TOP-DOWN: Pattern Rules: Declarative Knowledge (patterns defined for intent association)
• BOTTOM-UP: Bag of N-grams Tokens: Independent Tokens (patterns derived from the data)
[Figure axes: Learning Improves; Expressivity Increases]
Intent Classification Top-Down: Binary Classifier - Prior Knowledge
• Conceptual Dependency Theory [Schank, 1972]
  - Make meaning independent from the actual words in input
  - e.g., Class in an Ontology abstracts similar instances
• Verb Lexicon [Hollerit et al. 2013]
  - Relevant Levin's Verb categories [Levin, 1993]
  - e.g., give, send, etc.
• Syntactic Pattern
  - Auxiliary & modals: e.g., 'be', 'do', 'could', etc. [Ramanand et al. 2010]
  - Word order: Verb-Subject positions, etc.
Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014
Intent Classification Top-Down: Binary Classifier - Psycholinguistic Rules
• Transform knowledge into rules
• Examples:
  (Pronoun except 'you' = yes) ^ (need/want = yes) ^ (Adjective = yes/no) ^ (Things = yes) → Seeking
  (Pronoun except 'you' | Proper Noun = yes) ^ (can/could/would/should = yes) ^ (Levin Verb = yes) ^ (Determiner = yes/no) ^ (Adjective = yes/no) ^ (Things = yes) → Offering
Domain ontology
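As a rough illustration of how such rules could be evaluated, the sketch below checks the mandatory slots of the two example rules with simple word-set lookups. Everything beyond the rule structure is an assumption: the word lists are hypothetical stand-ins for the Levin verb lexicon and the domain ontology, the optional slots (Adjective, Determiner = yes/no) are ignored, and proper-noun detection and word order are left out because they would need a part-of-speech tagger.

```python
import re

# Hypothetical stand-ins for the lexicons named in the rules.
PRONOUNS_EXCEPT_YOU = {"i", "we", "he", "she", "they", "me", "us"}
NEED_WANT = {"need", "needs", "want", "wants"}
MODALS = {"can", "could", "would", "should"}
LEVIN_VERBS = {"give", "send", "donate", "bring"}                   # assumed subset
THINGS = {"clothes", "food", "water", "blood", "money", "shelter"}  # assumed ontology terms


def tokens(text):
    """Set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(text):
    """Apply the Seeking rule first, then the Offering rule; default to None."""
    t = tokens(text)
    has_pronoun = bool(t & PRONOUNS_EXCEPT_YOU)
    has_thing = bool(t & THINGS)
    if has_pronoun and (t & NEED_WANT) and has_thing:
        return "Seeking"
    if has_pronoun and (t & MODALS) and (t & LEVIN_VERBS) and has_thing:
        return "Offering"
    return "None"
```

Because the check is a bag-of-words test, a message such as "I want to give blood" satisfies the Seeking rule even though it reads as an offer, which is one way an exhaustive rule set runs into ambiguity.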
Intent Classification Top-Down: Binary Classifier - Lessons
• Preliminary Study
  - 2000 tweets, classified first as conversation and then by the rules; labeled by two native speakers
  - Labels: Seeking, Offering, None
• Results
  - Avg. F-1 score: 78% (Baseline F-1 score: 57% [Varga et al. 2013])
• Lessons
  - Role of prior knowledge: Domain Independent & Dependent
  - Limitation: exhaustive rule-set, low Recall; Ambiguity addressed, but sparsity remains
Intent Classification Hybrid: Binary Classifier - Design
• AMBIGUITY: addressed via rich feature space
1. Top-Down: Declarative Knowledge Patterns [Ramanand et al. 2010]
   DK(mi, P) → {0,1}
   e.g., P = \b(like|want) \b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b
   (acquired via Red Cross expert searches)
2. Abstraction: due to importance in info sharing [Nagarajan et al. 2010]
   - Numeric (e.g., $10) → _NUM_
   - Interactions (e.g., RT & @user) → _RT_, _MENTION_
   - Links (e.g., http://bit.ly) → _URL_
3. Bottom-Up: N-grams after stemming and abstraction [Hollerit et al. 2013]
   TOKENIZER(mi) → { bi-, tri-gram }
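The three feature sources above can be combined roughly as follows. This is a sketch under stated assumptions, not the published pipeline: the regular expression is the example pattern P from the slide, the abstraction regexes are my own approximations of the listed substitutions, and stemming is omitted because the Python standard library has no stemmer.

```python
import re

# Example declarative knowledge pattern P, copied from the slide.
DK_PATTERN = re.compile(
    r"\b(like|want) \b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b")


def abstract(text):
    """Replace links, retweet markers, mentions, and numbers with placeholders."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "_URL_", text)
    text = re.sub(r"\brt\b", "_RT_", text)
    text = re.sub(r"@\w+", "_MENTION_", text)
    text = re.sub(r"\$?\d+(?:\.\d+)?\$?", "_NUM_", text)
    return text


def ngrams(tokens, n):
    """All contiguous n-grams of the token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tokenizer(text):
    """Bi- and tri-grams over word tokens and placeholders (no stemming here)."""
    toks = re.findall(r"_[A-Z]+_|[a-z']+", text)
    return ngrams(toks, 2) + ngrams(toks, 3)


def features(message):
    """Binary DK feature plus n-gram counts for one message."""
    text = abstract(message)
    feats = {"DK": 1 if DK_PATTERN.search(text) else 0}
    for gram in tokenizer(text):
        feats[gram] = feats.get(gram, 0) + 1
    return feats
```

Lowercasing happens before the substitutions so that the uppercase placeholders stay distinct from ordinary tokens.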
Intent Classification Hybrid: Binary Classifier - Design
• SPARSITY: addressed via algorithmic choices
1. Feature Selection
2. Ensemble Learning
3. Classifier Chain
[Figure: classifier chain. Knowledge-driven features (X^T, y) are built from the DATASET; model m_1 on (X1^T, y1) outputs P(c1); model m_2 on (X2^T, y2) outputs P(c2) for the remaining 1 - P(c1).]
Intent Classification Hybrid: Binary Classifier - Experiments
• Binary classifiers:
  - Seeking vs. not Seeking
  - Offering vs. not Offering
• Dataset:
  - Candidate set: 4000 tweets classified as donation-related
  - Labels: min. 3 judges
  - Annotations: Seeking, Offering, None
Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014
Intent Classification Hybrid: Binary Classifier - Results
Columns: Experiment / Supervised Learning / Training Samples / Precision (*Baseline) / F-1 score / Class-labels
• Seeking vs. (None' + Offering): RF (CR=50:1) / 3836 / 98% (*79%) / 46% (56%) / 56% requests
• Offering vs. (None'): RF (CR=9:2) / 1763 / 90% (*65%) / 44% (*58%) / 13% offers
RF = Random Forest ensemble
CR = Asymmetric false-alarm Cost Ratios for True:False
Evaluation: 10-fold CV
Notes:
- Domain requires high precision more than high recall
- Scope for improving low recall
Intent Classification Hybrid: Multiclass Classifier - Generalization
• Lessons from binary classification
  - Improvement by fusing top-down & bottom-up
  - Sparsity
  - Ambiguity (Seeking & Offering complementary)
  → addressed via improved data representation
Hypothesis: Knowledge-guided approach improves multiclass classification accuracy
Hybrid Approach
• TOP-DOWN: Knowledge Patterns
  - (DK) Declarative
  - (SK) Social Behavior
  - (CTK, CSK) Contrast Patterns
• BOTTOM-UP: Bag of N-grams Tokens
  - (T) Independent Tokens
Intent Classification Hybrid: Multiclass Classifier - Feature Creation
1. (T) Bag of Tokens
   TOKENIZER(mi, min, max)
2. (DK) Declarative Knowledge Patterns
   - Domain expert guidance
   - Psycholinguistics syntactic & semantic rules
   - Expand by WordNet and Levin Verbs
   e.g., (how = yes) ^ (Modal-Set 'can' = yes) ^ (Pronouns except 'you' = yes) ^ (Levin Verb-Set 'give' = yes)
   Pj = Feature_Pj(mi) = 1 if Pj exists in mi, else 0
3. (SK) Social Knowledge Indicators
   - Offline conversation indicators studied in Problem 1
   e.g., Hj = Dialogue Management, Hj-set = {Thanks, anyway, ...}
   Feature_Hj(mi) = term-frequency(Hj-set, mi)
Intent Classification Hybrid: Multiclass Classifier - Feature Creation
4. (CTK) Contrast Knowledge Patterns
   INPUT: corpus {mi} cleaned and abstracted, min. support, X
   For each class Cj
   - Find contrasting patterns using sequential pattern mining
   OUTPUT: contrast patterns set {P} for each class Cj
   e.g., unique sequential patterns:
   SEEKING: help .* victim .* _url_ .*
   OFFERING: anyon .* know .* cloth .*
5. (CPK) Contrast Patterns: on Part-of-Speech tags of {mi}
Intent Classification Hybrid: Multiclass Classifier - Feature Creation
Finding CTK: Contrast Knowledge Patterns
For each class Cj
1. Tokenize the cleaned, abstracted text of {mi}
2. Mine Sequential Patterns: SPADE Algorithm
   - Output: sequences of token sets, {P'}
3. Reduce to minimal sequences {P}
4. Compute growth rate & contrast strength for P with all other Ck
5. Top-K ranked {P} by contrast strength
OUTPUT: contrast patterns set {P} for each class Cj

$$\mathrm{gr}(P, C_j, C_k) = \frac{\mathrm{support}(P, C_j)}{\mathrm{support}(P, C_k)} \quad (1)$$
$$\text{Contrast-Growth}(P, C_j, C_k) = \frac{1}{|C_j| - 1} \sum_{C_k,\, k \neq j} \frac{\mathrm{gr}(P, C_j, C_k)}{1 + \mathrm{gr}(P, C_j, C_k)} \quad (2)$$
$$\text{Contrast-Strength}(P, C_j) = \mathrm{support}(P, C_j) \cdot \text{Contrast-Growth}(P, C_j, C_k) \quad (3)$$
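Equations (1) to (3) can be traced with a short Python sketch. Several choices here are assumptions rather than details from the slides: support is taken as the fraction of a class's documents that contain the pattern as an ordered subsequence with gaps, the normalizer in (2) is read as the number of other classes, and a zero support in the other class is treated as an unbounded growth rate, so its term gr/(1+gr) is set to 1. Pattern mining itself (SPADE) is not shown.

```python
def contains_sequence(tokens, pattern):
    """True if pattern occurs in tokens as an ordered subsequence (gaps allowed)."""
    it = iter(tokens)
    return all(tok in it for tok in pattern)


def support(pattern, docs):
    """Fraction of documents containing the pattern (assumed definition)."""
    if not docs:
        return 0.0
    return sum(contains_sequence(d, pattern) for d in docs) / len(docs)


def growth_term(sup_j, sup_k):
    """gr / (1 + gr) with gr = sup_j / sup_k, guarding the zero cases."""
    if sup_j == 0:
        return 0.0
    if sup_k == 0:
        return 1.0  # assumption: unbounded growth rate maps to the limit value 1
    gr = sup_j / sup_k
    return gr / (1 + gr)


def contrast_strength(pattern, corpus, cj):
    """Equation (3): support in class cj times the averaged growth term of (2)."""
    sup_j = support(pattern, corpus[cj])
    others = [ck for ck in corpus if ck != cj]
    growth = sum(growth_term(sup_j, support(pattern, corpus[ck]))
                 for ck in others) / len(others)
    return sup_j * growth


# Toy corpus of tokenized, abstracted messages per class (made up for illustration).
corpus = {
    "Seeking": [["help", "victim", "_url_"],
                ["help", "the", "victim", "_url_"],
                ["donate", "now"]],
    "Offering": [["anyon", "know", "cloth"], ["help", "me"]],
    "None": [["storm", "news"]],
}
```

On this toy corpus the pattern ("help", "victim") occurs only in the Seeking class, so its growth term is 1 against both other classes and its contrast strength equals its support.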
Binarization Frameworks for Multiclass Classifier: 1 vs. All
[Figure: from the CORPUS (set of short text documents, S), knowledge-driven FEATURES (X^T, y) are built; models M_1, M_2, ..., M_K are trained on (X1^T, y1), (X2^T, y2), ..., (XK^T, yK) and output P(c1), P(c2), ..., P(cK).]
Subset Xj^T ⊂ S such that Xj^T includes all the labeled instances of class Cj for model M_j
(In 1 vs. 1 framework: K*(K-1)/2 classifiers, for each Cj, Ck pair)
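The two binarization frameworks can be illustrated with plain label bookkeeping. This is only a sketch of the setup, not of the learners: no model is trained, and picking the class with the highest P(cj) at prediction time is an assumption about how the 1 vs. All outputs are combined.

```python
from itertools import combinations


def one_vs_all(labels, classes):
    """For each class c_j, relabel the data as 1 (c_j) vs. 0 (every other class)."""
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def one_vs_one(classes):
    """All unordered class pairs; there are K*(K-1)/2 of them."""
    return list(combinations(classes, 2))


def predict_one_vs_all(probabilities):
    """Assumed decision rule: the class whose model reports the highest P(c_j)."""
    return max(probabilities, key=probabilities.get)


classes = ["Seeking", "Offering", "None"]
labels = ["Seeking", "None", "Offering"]
binary_targets = one_vs_all(labels, classes)  # K binary problems
pairs = one_vs_one(classes)                   # K*(K-1)/2 pairwise problems
```

With K = 3 intent classes, 1 vs. All trains three binary models and 1 vs. 1 also trains three; the counts diverge as K grows.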
Intent Classification Hybrid: Multiclass Classifier - Experiments
• Datasets
  - Dataset-1: Hurricane Sandy, Oct 27 – Nov 7, 2012
  - Dataset-2: Philippines Typhoon, Nov 7 – Nov 17, 2013
• Parameters
  - Base Learner M_j: Random Forest, 10 trees with 100 features
  - bi-, tri-gram for (T)
  - K=100% & min. support 10% for CTK, 50% for CPK
Intent Classification: Multiclass Classifier - Results
Dataset-1 (Hurricane Sandy, 2012)
[Figure: bar chart of Avg. F-1 Score (10-fold CV), axis 56% to 70%, for the 1-vs-1 and 1-vs-All frameworks over the feature sets T (Baseline); T,DK (Declarative); T,SK (Social); T,CTK,CSK (Contrast); and T,DK,SK,CTK,CSK.]
Gain 7%, p < 0.05
Intent Classification: Multiclass Classifier - Results
Dataset-2 (Philippines Typhoon, 2013)
[Figure: bar chart of Avg. F-1 Score (10-fold CV), axis 74% to 86%, for the 1-vs-1 and 1-vs-All frameworks over the feature sets T (Baseline); T,DK (Declarative); T,SK (Social); T,CTK,CSK (Contrast); and T,DK,SK,CTK,CSK.]
Gain 6%, p < 0.05
Lessons
1. Top-down & Bottom-up hybrid approach improves data representation for learning (complementary) intent classes
   - Top 1% discriminative features contained 50% knowledge-driven features
2. Offline theoretic social conversation (SK) features (the, thanks, etc.), often removed for text classification, are valuable for intent.
3. There is a varying effect of knowledge types (SK vs. DK vs. CTK/CPK) in different types of real-world event datasets
   → Culturally-sensitive psycholinguistic knowledge in future work
Problem 3. Group Engagement Model
How can organizations find reliable groups to engage for action?
• Engagement: degree of involvement in discussion
• Reliable groups: stay focused and collectively behave to diverge on topics
• Why & How do groups collectively evolve over time?
  1. Define a group from interaction network, g
  2. Define Divergence of g: content based, in contrast to structure
  3. Predict change in the divergence between time slices
     - Features of g based on theories of social identity & cohesion
Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014
Group Engagement Model: Integrated Approach Unlike Prior Work
• People (User): Participant of the discussion
AND
• Content (Text): Topic of Interest
AND
• Network (Community): Group around topic
KEY POINT: capture User Node Diversity
Group Engagement Model: Discussion Divergence
• Candidate Group: Detect in interaction network
• Group Discussion Divergence: Jensen-Shannon Divergence of topic distribution on group members' tweets
where H(*) = Shannon Entropy,
Bt = latent topic distribution of each tweet t in all members' tweets Tg,
Bg = mean topic distribution of group g
[Equations not recovered from the slide]
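The divergence equations themselves are missing from this transcript. A minimal sketch, assuming the standard generalized Jensen-Shannon divergence with uniform weights, D(g) = H(Bg) - (1/|Tg|) Σ_t H(Bt), with Bg taken as the plain average of the per-tweet distributions Bt, would look like this. Both the formula and the averaging are assumptions consistent with the definitions above, not a transcription of the slide; the topic model that produces Bt is also outside this sketch.

```python
import math


def entropy(p):
    """Shannon entropy (base 2) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def mean_distribution(dists):
    """Element-wise mean of equally long topic distributions (assumed form of Bg)."""
    n = len(dists)
    return [sum(d[i] for d in dists) / n for i in range(len(dists[0]))]


def discussion_divergence(tweet_topic_dists):
    """Assumed generalized JS divergence: H(mean of Bt) minus mean of H(Bt)."""
    b_g = mean_distribution(tweet_topic_dists)
    mean_h = sum(entropy(b) for b in tweet_topic_dists) / len(tweet_topic_dists)
    return entropy(b_g) - mean_h
```

Under this reading, a group whose tweets all share one topic distribution has divergence 0, and the value grows as members' tweets spread over different topics.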
Lessons
1. A content divergence based measure helps explain why groups collectively diverge
   - Less diverging groups write more social & future-action related content
2. Emerging events such as disasters have higher correlation with social identity-driven features
   → Role of social context
Application-1: Filter Content for Disaster Response
DISASTER Event → CITIZEN Sensors → RESPONSE Organizations
Intent-Classifiers as a Service
• [SEEKING] Me and @CeceVancePR are coordinating a clothing/food drive for families affected by Hurricane Sandy. If you would like to donate, DM us
• [OFFERING] Does anyone know how to donate clothes to hurricane #Sandy victims?
Broader Impact: Classifier Model integrated by Crisis Mapping Pioneer
Application-2: "We TRUST people!" User engagement tool
DISASTER Event → CITIZEN Sensors → RESPONSE Organizations
Tool to mine Important users
Broader Impact: Winner of Int’l Challenge: UN ITU Young Innovators 2014
COOPERATIVE SYSTEM between ORGANIZATIONS and CITIZEN SENSOR COMMUNITIES
• Awareness → ENGAGEMENT MODELING
  Org. Actor: Q1. Who to engage first?
• Articulation → INTENT MINING
  Org. Actor: Q2. What are Resource needs & availabilities?
Limitations & Future Work
• Cooperative System
  - CSCW Application specific to domain of crisis
  → How to create a full What-Where-When-Who knowledge base
• Intent Mining
  - Non-cooperation-assistive intent classes and the temporal drift of intent are not considered
  → How to mine actor-level intent beyond document level
• Group Engagement
  - Reliable prioritized groups based on Correlation, not Causality
  - Interplay of Offline and Online interactions is beyond the scope
  → How to incorporate intent in the group divergence
• Bipartite Intent Graph Matching
  - Reducing time complexity of Seeking vs. Offering matching
Conclusion
Prior knowledge, and interplay of features of users, their content, and network efficiently model Intent & Engagement for cooperation between citizen sensors and organizations in the online social communities.
Thanks to the Committee Members
[Left to Right] Prof. Amit Sheth (advisor, WSU), Prof. Guozhu Dong (WSU), Prof. Srinivasan Parthasarathy (OSU), Prof. TK Prasad (WSU), Dr. Patrick Meier (QCRI), Prof. Valerie Shalin (WSU)
[Photo labels: Computer Science, Social Science]
Acknowledgement, Thanks and Questions :)
• NSF SoCS grant IIS-1111182 to support this work
• Interdisciplinary Mentors, especially Prof. John Flach (WSU), Drs. Carlos Castillo (QCRI), Fernando Diaz (Microsoft), Meena Nagarajan (IBM)
• Kno.e.sis team, especially Andrew Hampton from the Psychology dept. and Shreyansh and Tanvi from CSE at Wright State, as well as Yiye Ruan (now Google) & David Fuhry at the Data Mining Lab, Ohio State University
• Colleagues: Digital Volunteers from the CrisisMappers network, StandBy Task Force, InCrisisRelief.org, info4Disasters, Humanity Road, Ushahidi, etc. and the subject matter experts at UN FPA
Other works
Challenges: Ambiguity, Sparsity, Diversity, Scalability
[Figure labels: (Interpretation), (users), (behaviors)]
• Mutual Influence in Sparse Friendship Network [AAAI ICWSM'12]
• User Summarization with Sparse Profile Metadata [ASE SocialInfo'12]
• Matching intent as task of Information Retrieval [FM'14]
• Knowledge-aware Bi-partite Matching [In preparation]
• Short-Text Document Intent Mining [FM'14, JCSCW'14]
• Actor-Intent Mining Complexity [In preparation]
• Modeling Group Using Diverse Social Identity & Cohesion [AAAI ICWSM'14]
• Modeling Diverse User-Engagement [SOME WWW'11, ACM WebSci'12]