Event Detection using Customer Care Calls 04/17/2013 IEEE INFOCOM 2013 Yi-Chao Chen 1, Gene Moo Lee 1, Nick Duffield 2, Lili Qiu 1, Jia Wang 2 The University of Texas at Austin 1, AT&T Labs – Research 2

Page 1:

Event Detection using Customer Care Calls

04/17/2013 IEEE INFOCOM 2013

Yi-Chao Chen¹, Gene Moo Lee¹, Nick Duffield², Lili Qiu¹, Jia Wang²

¹The University of Texas at Austin, ²AT&T Labs – Research

Page 2:

Motivation

Customer care calls are a direct channel between the service provider and its customers.
• Reveal problems observed by customers
• Understand the impact of network events on customer-perceived performance
  • Regions, services, groups of users, …

Page 3:

Motivation

Service providers have a strong motivation to understand customer care calls.
• Reduce cost: ~$10 per call
• Prioritize and handle anomalies by their impact
• Improve customers’ impression of the service provider

Page 4:

Problem Formulation

• Goal: automatically detect anomalies using customer care calls.
• Input: customer care calls
  • Call agents label calls using predefined categories (~10,000 categories)
  • A call is labeled with multiple categories: issue, customer need, call type, problem resolution

Page 5:

Problem Formulation

• Output: anomalies
  • Performance problems observed by customers
  • Different from traditional network metrics, e.g.
    • video conference: bandwidth is not important as long as it is higher than some threshold
    • online games: an increase in delay from 1 ms to 10 ms could be a big problem

Page 6:

Example: Categories of Customer Care Calls

[Figure: call volume per category over weeks 1 to 7. Categories: Equipment, Call Disconnected, Cannot make calls, Educated - How to use, Equipment inquiry, Feature, Service Plan. x-axis: Week; y-axis: Normalized Call Volume.]

Page 7:

Input: Customer Care Calls

Time bins: T1, T2, T3, …, TM

Each entry: # of calls of category n in time bin t
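The input described above can be sketched as a count matrix. This is only an illustration: the function name `build_call_matrix`, the record layout (a time-bin index plus a list of category labels per call), and the sample calls are assumptions, not taken from the slides.

```python
def build_call_matrix(calls, num_bins, categories):
    """Return A as a list of rows, where A[t][n] counts calls of category n in time bin t."""
    index = {name: n for n, name in enumerate(categories)}
    A = [[0] * len(categories) for _ in range(num_bins)]
    for time_bin, labels in calls:
        for label in labels:  # a call may carry several category labels
            A[time_bin][index[label]] += 1
    return A


# Hypothetical labeled calls: (time bin, categories assigned by the agent).
categories = ["Equipment", "Cannot make calls", "Service Plan"]
calls = [
    (0, ["Equipment"]),
    (0, ["Equipment", "Cannot make calls"]),
    (1, ["Service Plan"]),
    (2, ["Cannot make calls"]),
]
A = build_call_matrix(calls, num_bins=3, categories=categories)
```

Each row of `A` is one time bin and each column is one category, matching the "# of calls of category n in time bin t" entry on the slide.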

Page 8:

Example: Anomaly

[Figure: call volume over time, with the x-axis running from 10-May to 7-Jun. Annotated events: power outage; service outage due to DoS attack; low bandwidth due to maintenance.]

Output: anomaly indicator = [ 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ]

Page 9:

Challenges

• Customers respond to an anomaly in different ways.
• Events may not be detectable by a single category.
• A single category is vulnerable to outliers.
• There are thousands of categories.

[Figure: normalized call volume (0 to 1) over weeks 1 to 7. x-axis: Week; y-axis: Normalized call volume.]

Page 12:

Challenges

Inconsistent classification across agents and across call centers

[Figure: normalized call volume (0 to 1) over weeks 1 to 11 for the categories “Equipment”, “Equipment problem”, and “Troubleshooting - Equipment”. x-axis: Week; y-axis: Normalized Call Volume.]

Page 13:

Our Approach

We use regression, casting the problem as an inference problem, Ax = b:

• A(t,n): # of calls of category n at time t
• x(n): weight of the n-th category
• b(t): anomaly indicator at time t

Page 14:

Our Approach

• Training set: the historical data, which contains:
  • A: input time series of customer care calls
  • b: ground truth of when anomalies take place
  • x: the weights to learn
  • A x = b

• Testing set:
  • A’: the latest customer care call time series
  • x: the weights learned from the training set
  • b’: the anomalies to detect
  • A’ x = b’
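A minimal sketch of the train/test split above, assuming plain least squares for learning x and a 0.5 threshold for turning the scores A’x into a 0/1 indicator. Both choices, and all the numbers, are illustrative assumptions; the slides only state the relations A x = b and A’ x = b’.

```python
import numpy as np

# Hypothetical training data: 6 time bins x 3 categories, with ground truth b.
A_train = np.array([
    [5.0, 1.0, 0.0],
    [4.0, 0.0, 1.0],
    [20.0, 9.0, 1.0],
    [5.0, 1.0, 1.0],
    [18.0, 8.0, 0.0],
    [6.0, 0.0, 0.0],
])
b_train = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0])

# Learn x by least squares: minimize ||A x - b||_2 over x.
x, *_ = np.linalg.lstsq(A_train, b_train, rcond=None)

# Apply the learned weights to the latest call counts A'.
A_test = np.array([[5.0, 1.0, 0.0], [19.0, 9.0, 1.0]])
scores = A_test @ x
b_pred = (scores > 0.5).astype(int)  # threshold is an assumption
```

Here the second test bin, whose call counts resemble the anomalous training bins, is the one flagged.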

Page 15:

Issues

• Dynamic x: the relationship between customer care calls and anomalies may change.
• Under-constrained: # of categories can be larger than # of training traces.
• Over-fitting: the weights begin to memorize the training data rather than learning the generic trend.
• Scalability: there are thousands of categories and thousands of time intervals.
• Varying customer response time

Page 16:

Our Approach

• Reducing Categories
  • Clustering
  • Identifying important categories
• Regression
  • Temporal stability
  • Low-rank structure
• Combining multiple classifiers

Page 17:

Clustering

• Agents usually classify calls based on the textual names of categories, e.g. “Equipment”, “Equipment problem”, and “Troubleshooting - Equipment”.
• Cluster categories based on the similarity of their textual names, measured with Dice’s coefficient.
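A small sketch of name-based clustering with Dice's coefficient. The slides do not say how names are tokenized or how clusters are formed, so the character-bigram sets, the greedy single-pass grouping, and the 0.5 threshold below are all assumptions made for illustration.

```python
def bigrams(name):
    """Character bigrams of a lower-cased category name, ignoring non-alphanumerics."""
    s = "".join(ch for ch in name.lower() if ch.isalnum())
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice(a, b):
    """Dice's coefficient 2|X & Y| / (|X| + |Y|) over the two bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def cluster(names, threshold=0.5):
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters = []
    for name in names:
        for group in clusters:
            if dice(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters


names = ["Equipment", "Equipment problem", "Troubleshooting - Equipment", "Service Plan"]
groups = cluster(names)
```

With these assumptions the three equipment-related names from the slide fall into one cluster and "Service Plan" stays on its own.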

Page 18:

Identify Important Categories

• L1-norm regularization: penalize all factors in x equally and make x sparse.
• Select the categories whose corresponding value in x is not 0.
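One way to see the sparsifying effect of an L1 penalty is the sketch below, which minimizes 0.5·||Ax − b||² + λ·||x||₁ with iterative soft-thresholding (ISTA). The solver, the value of `lam`, and the synthetic data are assumptions chosen for illustration; the slides do not specify how the L1-regularized problem is solved.

```python
import numpy as np


def lasso_ista(A, b, lam, n_iter=2000):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the smooth term
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - b))  # gradient step on the fitting error
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x


# Synthetic example: only categories 2 and 7 actually drive b.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]
b = A @ x_true

x = lasso_ista(A, b, lam=1.0)
selected = np.flatnonzero(x != 0)  # categories kept for the later steps
```

The soft-threshold step sets small weights exactly to zero, which is what makes "select the categories with a nonzero value in x" a meaningful rule.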

Page 19:

Regression

Impose additional structure to address the under-constrained and over-fitting issues:
• The weight values are stable
• A small number of factors dominate anomalies

Fitting Error

Page 20:

Regression

Temporal stability: the weight values are stable across consecutive days.
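Temporal stability can be expressed as a penalty on day-to-day changes in the weights. The sketch below assumes the weights are stored as a matrix with one row per category and one column per day, and uses a sum of squared differences; both the layout and the exact penalty are assumptions, not the formulation from the slides.

```python
import numpy as np


def temporal_penalty(X):
    """Sum of squared differences between the weight vectors of consecutive days.

    X has one row per category and one column per day (an assumed layout).
    """
    return float(np.sum(np.diff(X, axis=1) ** 2))


# Hypothetical weights for 2 categories over 3 days.
stable = np.array([[1.0, 1.0, 1.1],
                   [0.0, 0.1, 0.1]])
jumpy = np.array([[1.0, -1.0, 1.0],
                  [0.0, 2.0, 0.0]])
```

Weights that drift slowly (`stable`) incur a small penalty, while weights that flip from day to day (`jumpy`) incur a large one, so adding such a term to the fitting error favors stable solutions.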

Page 21:

Regression

Low-rank structure: the weight values exhibit low-rank structure due to the temporal stability and the small number of dominant factors that cause the anomalies.
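To illustrate what low-rank structure means for a weight matrix, the sketch below builds hypothetical weights driven by a single factor and recovers them with a truncated SVD. The category-by-day layout and the numbers are assumptions; the slides do not state how the low-rank property is enforced.

```python
import numpy as np


def low_rank_approx(X, r):
    """Best rank-r approximation of X in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


# Hypothetical weights: 4 categories x 5 days, driven by one dominant factor.
category_factor = np.array([1.0, 0.5, 0.0, 2.0])
daily_factor = np.array([1.0, 1.1, 0.9, 1.0, 1.2])
X = np.outer(category_factor, daily_factor)

rank = np.linalg.matrix_rank(X)
X1 = low_rank_approx(X, 1)
```

Because every day's weight vector is a scaled copy of the same category pattern, a rank-1 approximation reproduces `X` up to rounding.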

Page 22:

Regression

Find the weights X that minimize:
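As a sketch only, an objective combining the three ingredients named on the regression slides (fitting error, temporal stability, low-rank structure) could take the following form. The per-day indexing d, the trade-off weights α and β, the squared-difference stability term, and the nuclear norm as a low-rank surrogate are all assumptions, not taken from the slides.

```latex
\min_{X}\;
\underbrace{\sum_{d} \lVert A_d\, x_d - b_d \rVert_2^2}_{\text{fitting error}}
\;+\; \alpha \underbrace{\sum_{d} \lVert x_{d+1} - x_d \rVert_2^2}_{\text{temporal stability}}
\;+\; \beta \underbrace{\lVert X \rVert_*}_{\text{low rank}},
\qquad X = [\, x_1 \; x_2 \; \cdots \,]
```

Here x_d would be the weight vector for day d (a column of X), and A_d, b_d the call counts and anomaly indicators for that day.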

Page 23:

Combining Multiple Classifiers

• What time scale should be used?
  • Customers do not respond to an anomaly immediately.
  • The response time may differ by hours.
• Include calls made in the previous n (1~5) and next m (0~6) hours as additional features.
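A sketch of how time-shifted features might be built and how several classifiers could be combined. It assumes hourly rows in A, zero-filling at the edges of the trace, one classifier per (n, m) choice, and a majority vote over their 0/1 outputs; the slides do not state the combination rule, so the vote in particular is an assumption.

```python
import numpy as np


def add_time_shifted_features(A, n_prev, m_next):
    """Append to each hourly row the rows of the previous n_prev and next m_next hours.

    Rows that would fall outside the trace are filled with zeros.
    """
    blocks = []
    for shift in range(-n_prev, m_next + 1):
        shifted = np.zeros_like(A)
        if shift < 0:
            shifted[-shift:] = A[:shift]   # calls from |shift| hours earlier
        elif shift > 0:
            shifted[:-shift] = A[shift:]   # calls from shift hours later
        else:
            shifted[:] = A
        blocks.append(shifted)
    return np.hstack(blocks)


def majority_vote(predictions):
    """Combine 0/1 predictions from several classifiers (one row each) by majority."""
    predictions = np.asarray(predictions)
    return (predictions.sum(axis=0) * 2 > predictions.shape[0]).astype(int)


A = np.array([[1], [2], [3], [4]])  # one category, four hourly bins
F = add_time_shifted_features(A, n_prev=1, m_next=1)
votes = majority_vote([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
```

Each row of `F` now holds the previous hour, the current hour, and the next hour side by side, so a regression on `F` can pick up calls that arrive before or after the bin in which the event is recorded.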

Page 24:

Evaluation

• Dataset
  • Customer care calls: data from a large network service provider in the US during Aug. 2010 ~ July 2011
  • Ground-truth events: all events reported by call centers, network operation centers, etc.
• Metrics
  • Precision: the fraction of claimed anomalies that are real anomalies
  • Recall: the fraction of real anomalies that are claimed
  • F-score: the harmonic mean of precision and recall
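The three metrics can be computed from a ground-truth indicator and a claimed indicator as in the sketch below. The function name and the two example vectors are hypothetical.

```python
def precision_recall_f(truth, claimed):
    """Precision, recall, and F-score for 0/1 anomaly indicators of equal length."""
    true_pos = sum(1 for t, c in zip(truth, claimed) if t == 1 and c == 1)
    n_claimed = sum(claimed)
    n_real = sum(truth)
    precision = true_pos / n_claimed if n_claimed else 0.0
    recall = true_pos / n_real if n_real else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f_score


truth = [0, 0, 0, 1, 0, 0, 1, 1, 0, 0]
claimed = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0]
p, r, f = precision_recall_f(truth, claimed)
```

In this example two of the three claimed anomalies are real and two of the three real anomalies are claimed, so precision, recall, and F-score are all 2/3.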

Page 28:

Evaluation – Identifying Important Features

[Figure: bar chart of precision, recall, and F-score (y-axis from 0 to 0.7) for L1-norm, L2-norm, PCA, and rand 2000.]

L1-norm outperforms rand 2000, PCA, and L2-norm by 454%, 32%, and 10%, respectively.

Page 31:

Evaluation – Regression

[Figure: bar chart of precision, recall, and F-score (y-axis from 0 to 0.7) for Fit+Temp+LR, Fit, and Rand 0.3.]

Fit+Temp+LR outperforms Random and Fit by 823% and 64%, respectively.

Page 34:

Evaluation – Number of Classifiers

[Figure: bar chart of precision, recall, and F-score (y-axis from 0 to 0.7) for 50, 30, 10, 5, 2, and 1 classifiers.]

We use 30 classifiers to trade off benefit and cost.

Page 35:

Contribution

• Propose to use customer care calls as a complementary source to network metrics.
  • A direct measurement of QoS perceived by customers
• Develop a systematic method to automatically detect events using customer care calls.
  • Scales to a large number of features
  • Robust to noise

Page 36:

Thank you!

Page 37:

Backup Slides

Page 39:

Twitter

• Leverage Twitter as external data
  • Additional features
  • Interpreting detected anomalies
• Information from a tweet
  • Timestamp
  • Text: Term Frequency - Inverse Document Frequency (TF-IDF)
  • Hashtags: keywords of the topics, used as features, e.g. #ATTFAIL
  • Location
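A minimal sketch of TF-IDF weighting over tweet text. It uses one common variant (term frequency times log(N / document frequency)) with whitespace tokenization; the variant, the function name, and the example tweets are assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical tweets.
tweets = [
    "service outage in nyc this morning",
    "no service again #fail",
    "great coffee this morning",
]
w = tf_idf(tweets)
```

Terms that appear in only one tweet, such as "outage", receive a higher weight than terms shared across tweets, such as "service", which is why TF-IDF is useful for summarizing what distinguishes the tweets around an event.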

Page 40:

Interpreting Anomaly - Location

Page 41:

Describing the Anomaly

Examples
• 3G network outage
  • Location: New York, NY
  • Event Summary: service, outage, nyc, calls, ny, morning, service
• Outage due to an earthquake
  • Location: East Coast
  • Event Summary: #earthquake, working, wireless, service, nyc, apparently, new, york
• Internet service outage
  • Location: Bay Area
  • Event Summary: serviceU, bay, outage, service, Internet, area, support, #fail

Page 42:

How to Select Parameters

K-fold cross-validation
• Partition the training data into K equal-size parts.
• In round i, hold out partition i for testing and use the remaining K-1 partitions for training.
• Repeat the process K times.
• Average the K results as the evaluation of the selected parameter value.
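A sketch of the conventional K-fold procedure, in which each partition is held out once for testing while the others are used for training. The helper names and the `score_fn` callback are hypothetical; `score_fn` stands in for "train with the candidate parameter value on the training indices and return a score on the test indices".

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold i is held out for testing in round i."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def cross_validate(score_fn, n_samples, k):
    """Average score_fn(train_idx, test_idx) over the k rounds."""
    scores = [score_fn(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / k


folds = list(k_fold_indices(6, 3))
```

To select a parameter, one would call `cross_validate` once per candidate value and keep the value with the best average score.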