Multi-scale Real-time Grid Monitoring with Job Stream Mining

Transcript of Multi-scale Real-time Grid Monitoring with Job Stream Mining

Page 1: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Multi-scale Real-time Grid Monitoring with Job Stream Mining
Xiangliang Zhang, Michele Sebag, Cecile Germain-Renaud
TAO − INRIA CNRS, Université de Paris-Sud, F-91405 Orsay Cedex, France
21 May 2009

Page 2: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Contents
1 Monitoring system: Grid adapted StrAP
2 Streaming Jobs
3 Monitoring Outputs
  Monitoring on short-time scale
  Clustering Quality
  Monitoring on medium-time scale
  Monitoring on large-time scale

Page 4: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Multi-scale Real-time Grid Monitoring System

Page 5: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Multi-scale Real-time Grid Monitoring System
[Figure: percentage of jobs assigned (%) to clusters 1 to 5 and to the outliers; each cluster's exemplar is shown as a job vector.]

Page 6: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Multi-scale Real-time Grid Monitoring System
[Figure, two panels: percentage of jobs (%) against days. Panel 1: distribution of jobs like [7 0 0 0 0 0]. Panel 2: distribution of jobs like [0 0 0 0 0 0].]

Page 7: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Multi-scale Real-time Grid Monitoring System
Affinity Propagation (AP)
  A clustering method: group similar points together
StrAP (Streaming AP)
  Online clustering of streaming data, based on AP

Page 8: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Why AP?
Affinity Propagation (AP)
  A clustering method
  Groups similar points together
  Converges by iterations of message passing -> more stable results
  No need for K (the number of clusters) -> less prior knowledge
  A real point as an exemplar to represent a cluster -> avoids meaningless averaged centers
Clustering by Passing Messages Between Data Points. B.J. Frey, D. Dueck. Science 2007
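The message passing behind AP can be sketched in a few lines of Python. This is a minimal illustration of the standard responsibility and availability updates, not the authors' implementation. The function name, the negative squared Euclidean similarity, the median default for the preference, and the damping and iteration values are all assumptions made for the sketch.

```python
def affinity_propagation(points, preference=None, damping=0.7, iterations=200):
    """Cluster points (at least two) with Affinity Propagation message passing."""
    n = len(points)
    # Similarity s(i, k): negative squared Euclidean distance (an assumption).
    s = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    if preference is None:
        # Common default: median of the off-diagonal similarities.
        off_diag = sorted(s[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off_diag[len(off_diag) // 2]
    for k in range(n):
        s[k][k] = preference  # how much point k wants to be an exemplar

    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                competitor = second if k == best else scores[best]
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - competitor)
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k) for i' not in {i, k})
        # a(k, k) = sum of positive r(i', k) for i' != k
        for k in range(n):
            positive = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(positive) - positive[k]
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - positive[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-availability plus self-responsibility is positive.
    exemplars = [k for k in range(n) if a[k][k] + r[k][k] > 0]
    if not exemplars:
        return [], [None] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda e: s[i][e])
              for i in range(n)]
    return exemplars, labels
```

Each returned label is the index of a real data point, which is the property the slide highlights: a cluster is represented by one of its own members rather than by an averaged center.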

Page 9: Multi-scale Real-time Grid Monitoring with Job Stream Mining

How AP works?

Page 17: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Grid adapted StrAP
Grid adapted StrAP (Streaming AP):
  Online clustering of streaming jobs -> one scan of the stream
  Incremental update of the model -> keeps tracking the stream
  Detecting distribution changes in the stream -> absorbs new patterns
Data Streaming with Affinity Propagation. Xiangliang Zhang, Cyril Furtlehner, Michele Sebag. ECML 2008.

Page 18: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Stream clustering
[Figure: the job stream, the current model and the reservoir.]

Page 19: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Stream clustering
[Figure: the job stream, the current model and the reservoir.]
Does x_t fit the current model?
  If yes, update the model
  Otherwise, go to the reservoir

Page 22: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Stream clustering
[Figure: the job stream, the current model and the reservoir.]
Has the distribution changed? CHANGE TEST
  If yes, rebuild the model
  Otherwise, continue
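A change test of this kind can be sketched with the Page-Hinkley statistic, which the model slide names as the change detector. The version below is the textbook form that raises an alarm when the monitored value drifts upward. The class name, the parameter values, and the choice of monitored quantity (for instance, a 0/1 flag telling whether a job went to the reservoir) are assumptions, not details taken from the slides.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an upward drift in a stream of values."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated drift per observation (assumed value)
        self.threshold = threshold  # alarm level, often written lambda (assumed value)
        self.count = 0
        self.mean = 0.0             # running mean of the observations
        self.cumulative = 0.0       # m_t: cumulated deviation from the running mean
        self.minimum = 0.0          # M_t: smallest m_t seen so far

    def update(self, value):
        """Feed one observation; return True when a change is signalled."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold
```

In a monitoring loop, a True return value would correspond to the "rebuild the model" branch; a fresh detector is then started for the rebuilt model.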

Page 24: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Our Model
Output
  e_i, the exemplar (center of the cluster)
  n_i, size of the cluster
  Σ_i, average distance of points to their exemplar
  T, time stamp when the cluster was last visited
Parameters
  ε, threshold for comparing each point with the model (set to around the value of Σ_i in the initial model)
  ∆, decay window (decreases the weight of old exemplars)
  Page-Hinkley parameters (change detection)
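The fit test from the stream clustering slides and the model fields listed above can be combined into a small sketch. The field and function names are hypothetical, the Euclidean distance is an assumption, and the decay weight ∆ / (∆ + t − T) is one plausible way to down-weight clusters that have not been visited recently; the slides do not give the exact update rule.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    exemplar: tuple   # e_i, a real job vector
    size: float       # n_i, (decayed) number of jobs represented
    spread: float     # Sigma_i, average distance of jobs to the exemplar
    last_visit: int   # T, time stamp of the last job absorbed


def process_job(job, t, model, reservoir, epsilon, decay_window):
    """Absorb `job` into the nearest cluster if it fits, else store it in the reservoir."""
    nearest = min(model, key=lambda c: math.dist(job, c.exemplar))
    distance = math.dist(job, nearest.exemplar)
    if distance > epsilon:
        reservoir.append(job)   # does not fit the current model
        return False
    # Assumed decay: the longer since the last visit, the smaller the old weight.
    weight = decay_window / (decay_window + (t - nearest.last_visit))
    decayed_size = nearest.size * weight
    nearest.spread = (decayed_size * nearest.spread + distance) / (decayed_size + 1)
    nearest.size = decayed_size + 1
    nearest.last_visit = t
    return True
```

The share of jobs that end up in the reservoir is a natural quantity to feed into the change test, since a growing reservoir suggests that the current exemplars no longer describe the stream.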

Page 25: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Contents
1 Monitoring system: Grid adapted StrAP
2 Streaming Jobs
3 Monitoring Outputs
  Monitoring on short-time scale
  Clustering Quality
  Monitoring on medium-time scale
  Monitoring on large-time scale

Page 26: Multi-scale Real-time Grid Monitoring with Job Stream Mining

EGEE (Enabling Grids for E-sciencE)
  Funded by the European Commission (contribution: 32,000,000 euro)
  Started in April 2004
  Grid infrastructure available to scientists 24 hours a day
  http://public.eu-egee.org/

Page 27: Multi-scale Real-time Grid Monitoring with Job Stream Mining

EGEE Jobs
EGEE logs of 39 RBs (Resource Brokers) during 5 months (2006-01-01 to 2006-05-31), collected by the Real Time Monitor (RTM) system (http://gridportal.hep.ph.ic.ac.uk/rtm/)
5,268,564 jobs
For each job:
  its final status (good or type of error)
  UI, RB, CE (User Interface, Resource Broker, Computing Element)
  time stamps of every service the job went through

Page 28: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Job attributes
  registration_Time: time for registering the job
  match_Time: time to find a matching resource
  upto_scheduled_transfer_Time: time for acceptance and transfer (waiting + ready time), as reported by the JobController (JC)
  upto_scheduled_acceptance_Time: the same as Ready_for_Transfer_Time, but as reported by the LogMonitor (LM)
  logmonitor_ce_scheduled_Time: time the job waits in a queue
  logmonitor_wn_Time: execution time

Page 29: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Multi-scale Real-time Grid Monitoring System

Page 30: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Pre-processing and Normalization
Pre-processing
  6 boolean attributes indicate whether the services were reached or not
Normalization
  by centering, with standard deviation 1
  job x_i is normalized to x'_i = (x_i − µ) / s
  where µ and s are the mean and standard deviation computed from a part of the stream.
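The two steps above can be sketched as follows. The function names are hypothetical, and two details are assumptions: a service that was never reached is represented by None, and an attribute with zero standard deviation in the sample is left unscaled to avoid a division by zero.

```python
import statistics


def service_flags(durations):
    """One boolean per service: True if the job reached it (None means not reached, by assumption)."""
    return [d is not None for d in durations]


def fit_scaler(sample):
    """Estimate per-attribute mean (mu) and standard deviation (s) from a part of the stream."""
    columns = list(zip(*sample))
    means = [statistics.fmean(column) for column in columns]
    # Fall back to 1.0 when an attribute is constant in the sample.
    stds = [statistics.pstdev(column) or 1.0 for column in columns]
    return means, stds


def normalize(job, means, stds):
    """Apply x' = (x - mu) / s attribute by attribute."""
    return [(x - m) / s for x, m, s in zip(job, means, stds)]
```

Because µ and s are estimated once from an initial portion of the stream, later jobs are scaled with the same constants, which keeps distances to the stored exemplars comparable over time.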

Page 31: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Load of jobs per day
[Figure: number of jobs per day (×10^4) against days; markers distinguish Sat & Sun, Mon, Tue, Wed, Thu and Fri.]

Page 32: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Contents
1 Monitoring system: Grid adapted StrAP
2 Streaming Jobs
3 Monitoring Outputs
  Monitoring on short-time scale
  Clustering Quality
  Monitoring on medium-time scale
  Monitoring on large-time scale

Page 33: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Monitoring on a short-time scale

Page 34: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Real-time Monitoring: when change detected
[Figure: percentage of jobs assigned (%) to clusters 1 to 5 and to the reservoir; each cluster's exemplar is shown as a job vector.]

Page 35: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Real-time Monitoring: when change detected
[Figure: the assignment of jobs between restart 1 and restart 2, for clusters 1 to 8 and the reservoir; each cluster's exemplar is shown as a job vector.]

Page 36: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Real-time Monitoring: when change detected
[Figure: the assignment of jobs between restart 2 and restart 3, for clusters 1 to 8 and the reservoir; each cluster's exemplar is shown as a job vector.]

Page 37: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Real-time Monitoring: when change detected
[Figure: the assignment of jobs between restart 3 and restart 4, for clusters 1 to 8 and the reservoir; each cluster's exemplar is shown as a job vector.]

Page 38: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Real-time Monitoring: when change detected
[Figure: the assignment of jobs between restart 4 and restart 5, for clusters 1 to 8 and the reservoir; each cluster's exemplar is shown as a job vector.]

Page 39: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Real-time Monitoring: when change detected
[Figure: the assignment of jobs between restart 5 and restart 6, for clusters 1 to 8 and the reservoir; each cluster's exemplar is shown as a job vector.]
LogMonitor is getting clogged.

Page 40: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Who is responsible for the clogging?
[Figure: distribution of Attr4/Attr3 against jobs, comparing all jobs over the 39 RBs with the jobs from the 9-th RB.]

Page 41: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Who is responsible for the clogging? Which RB?
[Figure: correlation coefficients per RB; labelled RBs: gdrb04.****.ch, gdrb03.****.ch, lappgrid07.****.fr.]

Page 42: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Clustering Quality Assessment

Page 43: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Clustering Purity
Purity = 100% × (Σ_{i=1}^{K} |C_i^d| / |C_i|) / K
where K is the number of clusters, |C_i| is the size of cluster i, and |C_i^d| is the number of majority-class items in cluster i.
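The purity formula translates directly into code. This is a minimal sketch with a hypothetical function name; it assumes each job carries a cluster id and a class label such as its final status.

```python
from collections import Counter, defaultdict


def clustering_purity(cluster_ids, class_labels):
    """Average, over clusters, of the majority-class fraction, as a percentage."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster].append(label)
    # |C_i^d| / |C_i| for each cluster i
    ratios = [Counter(labels).most_common(1)[0][1] / len(labels)
              for labels in members.values()]
    return 100.0 * sum(ratios) / len(ratios)
```

For example, with two clusters where one holds two good jobs and one failed job, and the other holds two failed jobs, the per-cluster ratios are 2/3 and 1, so the purity is 100 × (2/3 + 1) / 2, about 83.3%. Note that every cluster counts equally in this average, whatever its size.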

[Figure: averaged purity of each cluster (%) and number of clusters, both plotted against restarts.]

Page 44: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Discussion
Real-time quality:
  on average 10000 jobs processed in 1 minute, versus a maximum load of 80000 jobs per day (about 56 jobs per minute)
  measured on an Intel 2.66GHz Dual-Core PC with 2 GB memory, coding in Matlab
  on average 60000 jobs in 1 minute when coding in C/C++
Compact and live description of job patterns:
  proportion of good jobs and failed jobs
  different time costs of the services the jobs went through

Page 45: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Monitoring on a medium-time scale

Page 46: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Rupture steps
[Figure: number of restarts per day against days.]
  keeps tracking the evolution of the job distribution
  provides an intuitive view of the grid regime and its stability

Page 47: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Monitoring on a large-time scale

Page 48: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Large-time scale Monitoring: Global view
[Figure, two panels: percentage of jobs (%) against days. Panel 1: distribution of jobs like [7 0 0 0 0 0]. Panel 2: distribution of jobs like [0 0 0 0 0 0].]

Clustering the exemplars -> Super exemplars
Super clusters: clusters of exemplars
The history behavior of these super clusters
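One way to obtain such a history is sketched below. It assumes the super exemplars have already been computed by clustering the stored exemplars, and that each day's model is available as a list of (exemplar, number of jobs) pairs. The function name, the data layout, and the nearest-super-exemplar assignment by Euclidean distance are all assumptions made for illustration.

```python
import math


def daily_share(daily_exemplars, super_exemplars):
    """For each day, percentage of jobs falling under each super exemplar."""
    history = []
    for day in daily_exemplars:
        counts = [0.0] * len(super_exemplars)
        for exemplar, n_jobs in day:
            # Attach the exemplar, with all the jobs it represents, to the closest super exemplar.
            nearest = min(range(len(super_exemplars)),
                          key=lambda j: math.dist(exemplar, super_exemplars[j]))
            counts[nearest] += n_jobs
        total = sum(counts)
        history.append([100.0 * c / total if total else 0.0 for c in counts])
    return history
```

Each row of the result is one day and each column one super cluster, which is the shape needed for per-day curves of the kind shown in the two panels.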

Page 49: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Bad Super Examples: day view
[Figure: heat map of super clusters (1 to 20) against days, with a color scale from 0 to 90%.]
Re-check of the "early stopped error" type of errors (first row)
Date   Jan 7∼13   Jan 30∼Feb 3   Mar 16∼21   May 17∼19
UI     A1         A1             B1          D1 and A1

Page 50: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Discussion and Conclusion
  real-time monitoring of Grid job streams
  providing multi-scale models to describe the status of the Grid
    proportion of different types of job patterns (real-time view, day view, week view, ...)
    rupture steps
    offline global analysis
  good clustering quality is guaranteed

Page 51: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Future work
  more comprehensive description of the jobs, e.g., related to UI and CE
  interpret the model dynamics, e.g., relating the rebuild frequency to calendar or social events, in collaboration with the operation teams.

Page 52: Multi-scale Real-time Grid Monitoring with Job Stream Mining

Thank you
Questions?