DuDE: A Distributed Computing System using a Decentralized P2P Environment

The 4th International Workshop on Architectures, Services and Applications for the Next Generation Internet (WASA-NGI-IV), Bonn, Germany, October 4th, 2011

J. Skodzik, P. Danielis, V. Altmann, J. Rohrbeck, D. Timmermann
University of Rostock, Germany, Institute of Applied Microelectronics and Computer Engineering

T. Bahls, D. Duchow
Nokia Siemens Networks, Broadband Access Division, Greifswald, Germany

Page 1

University of Rostock

DuDE: A Distributed Computing System using a Decentralized P2P Environment

The 4th International Workshop on Architectures, Services and Applications for the Next Generation Internet (WASA-NGI-IV)

Bonn, Germany, October 4th, 2011

J. Skodzik, P. Danielis, V. Altmann, J. Rohrbeck, D. Timmermann
University of Rostock, Germany
Institute of Applied Microelectronics and Computer Engineering

T. Bahls, D. Duchow
Nokia Siemens Networks
Broadband Access Division
Greifswald, Germany

Page 2

Outline

• Introduction & Motivation

• DuDE in General

• The DuDE Algorithm in Detail

• Test Scenario and Evaluation

• Summary and Future Work

Page 3

[Figure: Internet users and data traffic (AMS-IX), 2007 to 2011. Series: number of Internet users and data traffic. Axes: millions of users; average monthly data traffic [1000 TB].]

• Increasing number of Internet users and growing data traffic

• Internet Service Providers (ISPs) want to ensure:
  • Quality of Service (QoS)
  • The detection of bottlenecks
  • The detection of attacks

• How can ISPs ensure this?

Statistics generated from existing log data

[Diagram, situation today: users connect to the Internet through an Access Node (AN).]

Does an AN have enough resources? Does it provide sufficient statistics at all?

Page 4

Introduction & Motivation

[Diagram: users connect to the Internet through an AN. Resource utilization of the AN: CPU 60%, MEM 60%.]

Page 5

Introduction & Motivation

[Diagram: a single AN connecting users to the Internet, and a network of four ANs (AN 1 to AN 4) connected to the Internet.]

Simple support of new statistics types

Simultaneous computation of multiple statistics

Processing of increasing log data volumes

• Processor utilization
• RAM utilization
• Drops
• Number of packets

Short term statistics (STS) for single ANs

Supported

Not supported

Long term statistics (LTS)

Creation of global statistics

Page 6

Introduction & Motivation

[Diagram: a user and four ANs (AN 1 to AN 4) connected to the Internet.]

• One AN does not have enough hardware resources

Usage of multiple ANs to compute statistics

• Efficient resource sharing with high resilience and scalability

Utilization of P2P technology

DuDE: Exploitation of already available resources

No extra costs for additional equipment

Page 7

DuDE in General

[Diagram: the network of four ANs (AN 1 to AN 4) connected to the Internet, shown next to a logical P2P ring of four nodes (Node1 to Node4), each with its own ID.]

Logical P2P ring

Page 8

DuDE in General

[Diagram: network of four ANs (AN 1 to AN 4) connected to the Internet.]

Log data (some hundreds of KB)

Data chunk (ca. 100 KB)

Page 9

DuDE in General

• Objective: High log data availability = 99.999 %
• Simple replication wastes memory resources

Reed-Solomon Codes
• Split log data of each AN into m data chunks

[Diagram: Log Data → Split → m Log Data Chunks]
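To make the trade-off behind the 99.999 % objective concrete, the following minimal sketch compares plain replication with an erasure code that survives as long as any m of n = m + k chunks are reachable. It assumes that every replica or chunk is stored on a different peer and that peers are reachable independently with probability p. The values p = 0.95 and m = 4 are illustrative assumptions, not figures from the talk, and all function names are hypothetical.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Data is available if at least one of `copies` full replicas is reachable."""
    return 1.0 - (1.0 - p) ** copies


def erasure_availability(p: float, m: int, k: int) -> float:
    """Data is available if at least m of the n = m + k chunks are reachable."""
    n = m + k
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(m, n + 1))


def min_copies(p: float, target: float) -> int:
    """Smallest number of full replicas that reaches the target availability."""
    copies = 1
    while replication_availability(p, copies) < target:
        copies += 1
    return copies


def min_coding_chunks(p: float, m: int, target: float) -> int:
    """Smallest number k of coding chunks that reaches the target availability."""
    k = 0
    while erasure_availability(p, m, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    p, m, target = 0.95, 4, 0.99999  # assumed values for illustration only
    copies = min_copies(p, target)
    k = min_coding_chunks(p, m, target)
    print(f"replication: {copies} copies, storage factor {copies:.2f}")
    print(f"erasure code: m={m}, k={k}, storage factor {(m + k) / m:.2f}")
```

Under these assumed values, replication needs 4 full copies (storage factor 4), whereas the erasure code needs k = 5 coding chunks on top of m = 4 data chunks (storage factor 9/4 = 2.25).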

Page 10

DuDE in General

• Objective: High log data availability = 99.999 %
• Simple replication wastes memory resources

Reed-Solomon Codes
• Split log data of each AN into m data chunks
• Encoding: Add k interleaved coding chunks, giving n = m + k chunks

[Diagram: Log Data → Split → m Log Data Chunks → Encoding → m Log Data Chunks and k Coding Chunks]

Page 11

DuDE in General

• Objective: High log data availability = 99.999 %
• Simple replication wastes memory resources

Reed-Solomon Codes
• Split log data of each AN into m data chunks
• Encoding: Add k interleaved coding chunks, giving n = m + k chunks
• Decoding: Restore log data from any m of n chunks

[Diagram: n = m + k Data/Coding Chunks, plus Erasures → Decoding → Log Data]
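The "any m of n" property can be illustrated with a toy Reed-Solomon-style erasure code. This is a sketch under simplifying assumptions, not the implementation used in DuDE: it works over the prime field GF(257) instead of the GF(2^8) arithmetic common in practice, it treats each chunk as a single symbol, and coding symbols may take the value 256, which does not fit into one byte. The function names are hypothetical. The m data symbols define a polynomial of degree below m, the k coding symbols are further evaluations of that polynomial, and any m surviving evaluations determine it again.

```python
P = 257  # prime modulus; every byte value 0..255 is a valid field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, modulo P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for i, (xi, _) in enumerate(points):
            if i != j:
                num = num * (x - xi) % P
                den = den * (xj - xi) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def encode(data, k):
    """Return n = m + k symbols: the m data symbols followed by k coding symbols."""
    m = len(data)
    if m + k > P:
        raise ValueError("at most 257 symbols per codeword in this toy field")
    points = list(enumerate(data))  # symbol i is the polynomial value at x = i
    return list(data) + [_interpolate(points, x) for x in range(m, m + k)]


def decode(survivors, m):
    """Restore the m data symbols from a dict {index: symbol} with at least m entries."""
    points = list(survivors.items())[:m]
    if len(points) < m:
        raise ValueError("need at least m surviving symbols")
    return [_interpolate(points, x) for x in range(m)]


if __name__ == "__main__":
    data = [68, 117, 68, 69]          # the bytes of "DuDE", so m = 4
    codeword = encode(data, k=2)      # n = 6 symbols
    survivors = {i: codeword[i] for i in (0, 2, 4, 5)}  # symbols 1 and 3 are erased
    print(decode(survivors, m=4) == data)
```

For real log data, the same procedure would be applied symbol by symbol across chunks of roughly 100 KB each.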

Page 12

DuDE in General

[Diagram: network of four ANs (AN 1 to AN 4) connected to the Internet.]

Log data (some hundreds of KB)

Data chunk (ca. 100 KB)

How to apply our application to P2P?

Page 13

DuDE in General

[Diagram: three views of the AN network (AN 1 to AN 4) connected to the Internet, with the labels Admin., Job and Task.]

• Job = collection of STS and/or LTS tasks
• Task = part of a job, e.g., a request for "CPU" statistics
• Jobscheduler (JS): Reception and monitoring of a job
• Taskwatcher (TW): Reception and processing of a task

Which steps are necessary to compute statistics?
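The four terms defined on this slide can be sketched as plain data types. This is only an illustration of the vocabulary: the class and field names are hypothetical, and the round-robin hand-out of tasks is an assumption, since the slide does not state how a jobscheduler chooses taskwatchers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class StatKind(Enum):
    STS = "short term statistics"
    LTS = "long term statistics"


@dataclass(frozen=True)
class Task:
    """Part of a job, e.g. a request for CPU statistics."""
    metric: str
    kind: StatKind


@dataclass
class Job:
    """A collection of STS and/or LTS tasks."""
    tasks: List[Task] = field(default_factory=list)


class Taskwatcher:
    """Receives and processes a task."""

    def __init__(self, node: str) -> None:
        self.node = node

    def process(self, task: Task) -> str:
        # Placeholder for the real statistics computation.
        return f"{self.node}: {task.kind.name} for {task.metric}"


class Jobscheduler:
    """Receives a job and monitors it until every task has a result."""

    def __init__(self, taskwatchers: List[Taskwatcher]) -> None:
        self.taskwatchers = taskwatchers

    def run(self, job: Job) -> List[str]:
        results = []
        for i, task in enumerate(job.tasks):
            # Assumption: tasks are handed out round-robin.
            watcher = self.taskwatchers[i % len(self.taskwatchers)]
            results.append(watcher.process(task))
        return results


if __name__ == "__main__":
    job = Job([Task("CPU", StatKind.STS), Task("RAM", StatKind.LTS)])
    scheduler = Jobscheduler([Taskwatcher("AN 2"), Taskwatcher("AN 3")])
    for line in scheduler.run(job):
        print(line)
```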

Page 14

The DuDE Algorithm in Detail

Stage 1: Resource collection

[Diagram: Admin. and four ANs with resource utilization pairs of 60%/60%, 50%/50%, 30%/30% and 10%/10%. Legend: Job, Task, Log data, Global statistics.]

Page 15

The DuDE Algorithm in Detail

Stage 2: Jobscheduler determination

Previous stages: 1. Resource collection

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]

Page 16

The DuDE Algorithm in Detail

Stage 3: Resource re-collection

Previous stages: 1. Resource collection, 2. Jobscheduler determination

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]

Page 17

The DuDE Algorithm in Detail

Stage 4: Task assignment

Previous stages: 1. Resource collection, 2. Jobscheduler determination, 3. Resource re-collection

• Request for Processor utilization STS
• Request for RAM utilization LTS
• Request for Drops LTS

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]
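Stages 1 to 4 can be sketched as a small selection routine. The slides name the stages but not the selection criteria, so this sketch assumes that the node with the lowest reported utilization becomes the jobscheduler and that tasks then go to the remaining nodes in order of increasing utilization. The mapping of the percentages from the resource collection slide to AN 1 to AN 4 is also assumed, and the function names are hypothetical.

```python
def pick_jobscheduler(utilization):
    """Stage 2 (assumed criterion): the least utilized node becomes the jobscheduler."""
    return min(utilization, key=utilization.get)


def assign_tasks(tasks, utilization, jobscheduler):
    """Stage 4 (assumed criterion): hand tasks to the other nodes, least utilized first."""
    candidates = sorted(
        (node for node in utilization if node != jobscheduler),
        key=utilization.get,
    )
    if not candidates:
        raise ValueError("no node left to act as taskwatcher")
    return {task: candidates[i % len(candidates)] for i, task in enumerate(tasks)}


if __name__ == "__main__":
    # Stage 1: collected utilization in percent (assignment to ANs is illustrative).
    utilization = {"AN 1": 60, "AN 2": 50, "AN 3": 30, "AN 4": 10}
    jobscheduler = pick_jobscheduler(utilization)
    # Stage 3 would re-collect the values at the jobscheduler; reused here unchanged.
    tasks = ["Processor utilization STS", "RAM utilization LTS", "Drops LTS"]
    for task, node in assign_tasks(tasks, utilization, jobscheduler).items():
        print(f"{task} -> {node}")
```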

Page 18

The DuDE Algorithm in Detail

Stage 4: Task assignment

Previous stages: 1. Resource collection, 2. Jobscheduler determination, 3. Resource re-collection

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]

How to find all log data for global statistics computation?

Page 19

The DuDE Algorithm in Detail

Global Peer Data Discovery Algorithm

• Request for global statistics: all data needed
• Threshold value A = 2

[Diagram: logical P2P ring with Node1, Node2, Node3 and Node5, each with an ID; one node is marked as taskwatcher.]

i   Succ?   N   N >= A   ID
1   yes     0   no       Node1
2   yes     0   no       Node2
3   yes     0   no       Node3
4   no      1   no       Node4
5   no      2   yes      Node5

Algorithm is done

i   Succ?   N   N >= A   ID
5   yes     0   no       Node5
6   no      1   no       Node6
7   yes     2   yes      Node7

Algorithm is done
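One possible reading of the counter columns in the table on this slide: the taskwatcher keeps asking for the next peer, resets N whenever a lookup succeeds, increments N whenever it does not, and stops once N reaches the threshold A. The sketch below implements only this counter logic. How lookups are performed and what exactly counts as a successful lookup are not stated on the slide, so `lookup_successor` is a hypothetical callback and treating an already known peer as a miss is an assumption.

```python
def discover_peers(start, lookup_successor, threshold):
    """Collect peers until `threshold` consecutive lookups reveal no new peer.

    lookup_successor(node) returns the next peer's ID, or None if the lookup fails.
    """
    known = [start]
    current = start
    misses = 0  # N in the table
    while misses < threshold:  # stop once N >= A
        candidate = lookup_successor(current)
        if candidate is not None and candidate not in known:
            known.append(candidate)
            current = candidate
            misses = 0
        else:
            misses += 1
    return known


if __name__ == "__main__":
    # A closed ring of three peers; the successor of the last peer is the first one.
    ring = {"Node1": "Node2", "Node2": "Node3", "Node3": "Node1"}
    print(discover_peers("Node1", ring.get, threshold=2))
```

A threshold above 1 lets the search tolerate a single failed lookup before concluding that all peers holding log data have been found.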

Page 20

The DuDE Algorithm in Detail

Stage 5: Log data collection

Previous stages: 1. Resource collection, 2. Jobscheduler determination, 3. Resource re-collection, 4. Task assignment

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]

Page 21

The DuDE Algorithm in Detail

Stage 6: Statistics computation

Previous stages: 1. Resource collection, 2. Jobscheduler determination, 3. Resource re-collection, 4. Task assignment, 5. Log data collection

• Processor utilization stat.
• RAM utilization stat.
• Drops stat.

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]

Page 22

The DuDE Algorithm in Detail

Stage 7: Send results and display them

Previous stages: 1. Resource collection, 2. Jobscheduler determination, 3. Resource re-collection, 4. Task assignment, 5. Log data collection, 6. Statistics computation

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]

Page 23

The DuDE Algorithm in Detail

Stage 7: Send results and display them

Previous stages: 1. Resource collection, 2. Determine jobscheduler, 3. Resource recollection, 4. Assign tasks, 5. Restore log data, 6. Compute statistics

[Figure: Processor utilization of all ANs. x-axis: AN 1 to AN 5; y-axis: Processor Utilization [%], 0 to 90.]

[Diagram: Admin. and the ANs. Legend: Job, Task, Log data, Global statistics.]
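A global statistic such as "Processor utilization of all ANs" boils down to aggregating the log data of every AN. The following minimal sketch assumes that the restored log data is available as (AN name, CPU utilization in percent) records; this record format, the function name and the sample values are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict
from statistics import mean


def processor_utilization_per_an(records):
    """Average the CPU utilization samples of each AN (one value per AN)."""
    samples = defaultdict(list)
    for an, cpu_percent in records:
        samples[an].append(cpu_percent)
    return {an: mean(values) for an, values in sorted(samples.items())}


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    records = [("AN 1", 60), ("AN 2", 50), ("AN 1", 80), ("AN 3", 30)]
    for an, value in processor_utilization_per_an(records).items():
        print(f"{an}: {value}%")
```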

Page 24

Test Scenario and Evaluation

[Diagram: nodes 1 to 7 and the Admin. connected to a switch.]

PC Configuration

Pentium 4 (1.5 GHz)

512 MB RAM

Equivalent to AN HW

Page 25

Test Scenario and Evaluation

• Parameters:
  • Number of tasks inside a job
  • Number of log data sets in the P2P network
  • Computational load for statistics computation

• Measurements:
  • Time for finishing a job
  • Memory utilization

Page 26

Test Scenario and Evaluation

[Figure: Finishing a Job (Load: 0 / Log Data Sets: 7). Series: Single AN, DuDE. x-axis: Number of Tasks (1 to 6); y-axis: Time [s] (0 to 500). Annotations: "Linear Increase of Needed Time" and "Time is Constant".]

Page 27

Test Scenario and Evaluation

[Figure: Finishing a Job (Load: 10 / Log Data Sets: 7). Series: Single AN, DuDE. x-axis: Number of Tasks (1 to 6); y-axis: Time [s] (0 to 4000). Annotations: "Linear Increase of Needed Time" and "Time is Constant".]
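The two annotations on the timing plots are what a simple scheduling model would predict. This is a sketch, not the measured data: it assumes that a single AN works through its tasks one after another, that DuDE runs each task on a separate taskwatcher in parallel, and that six taskwatchers are available (seven test nodes minus one jobscheduler, which is itself an assumption). The task time and the function names are hypothetical.

```python
from math import ceil


def single_an_time(num_tasks, task_time):
    """Sequential processing: the time grows linearly with the number of tasks."""
    return num_tasks * task_time


def dude_time(num_tasks, task_time, num_taskwatchers, overhead=0.0):
    """Parallel processing: the time grows only when tasks outnumber taskwatchers."""
    rounds = ceil(num_tasks / num_taskwatchers)
    return overhead + rounds * task_time


if __name__ == "__main__":
    task_time = 10.0  # assumed seconds per task, for illustration only
    for n in range(1, 7):
        print(n, single_an_time(n, task_time), dude_time(n, task_time, num_taskwatchers=6))
```

In this model the DuDE time stays flat for up to six tasks and would step up only with a seventh task, which lies outside the range shown in the plots.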

Page 28

Test Scenario and Evaluation

[Figure: Maximum Memory Utilization of AN (Load: 10 / Log Data Sets: 7). Series: Single AN, Jobscheduler, Ø Taskwatcher. x-axis: Number of Tasks (1 to 6); y-axis ticks: 0 to 70,000,000. Annotations: "Linear Increase of Memory Utilization" and "Constant Memory Utilization".]

Page 29

Test Scenario and Evaluation

[Figure: Maximum Memory Utilization of ANs (Tasks: 6 / Load: 10). Series: Single AN, Jobscheduler, Ø Taskwatcher. x-axis: Number of Log Data Sets (1, 3, 7); y-axis ticks: 0 to 70,000,000.]

Memory utilization increases more at the single AN than at the taskwatchers

Page 30

Test Scenario and Evaluation

[Figure: Maximum Memory Utilization of ANs (Log Data Sets: 7 / Tasks: 6). Series: Single AN, Jobscheduler, Ø Taskwatcher. x-axis: Computational Load (0, 1, 10); y-axis ticks: 0 to 70,000,000.]

Memory utilization is constant and independent of the computational load

Page 31

Summary and Future Work

• P2P-based system for distributed computing of statistics

• STS and LTS

• Statistics for a single AN and the whole network

• Global Peer Data Discovery Algorithm

• Successfully developed prototype (demo session)

• Investigation of further use cases

Page 32

Thanks for your attention!

Questions?