Peer-to-peer Networks : promise and trouble.
1
UNIVERSITEIT GENT
Peer-to-peer Networks : promise and trouble.
Bart Dhoedt
Ghent University - Faculty of Applied Sciences
Department of Information Technology (INTEC)
Presentation at NORDUnet Network Conference
August 24-27, Reykjavik, 2003
Tuesday, August 27, 2003.
e-mail : [email protected]
tel : ++32 9 264 99 66
2
OUTLINE
1. Introduction
2. Taxonomy of P2P-systems
3. Issues in P2P-systems
4. P2P-trends
5. Concluding remarks
4
Defining P2P
• about sharing
• symmetric (architectural view)
• creating an application-level overlay network
• decentralized
• application critical infrastructure owned by many
[Figure: what is shared. Hardware resources and Software resources : disk space, bandwidth, content, liability, computer cycles]
5
Sharing resources ?
- estimate of edge resources
total number of Internet hosts : 150 M
average disk capacity : 10 GB
average available memory : 128 MB
average processing power : 1 GFLOPS
average BW : 100 Kb/s
- available for P2P-network
1% hosts
50% processing power
50% memory
10% disk space
25% network bandwidth
1.5 M processors
disk storage : 1.5 PB
processing power : 1.5 PFLOPS
BW/link : 25 Kb/s
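The estimate above is plain multiplication and can be checked in a few lines. This is only a sketch built from the slide's inputs: the function and key names are illustrative, and decimal unit prefixes are assumed. Applying the stated 50% share to processing power gives 0.75 PFLOPS; the 1.5 PFLOPS quoted on the slide corresponds to counting the full 1 GFLOPS of every participating host, so both values are returned.

```python
# Back-of-the-envelope estimate of edge resources, using the slide's inputs.
HOSTS = 150e6      # total number of Internet hosts
DISK_GB = 10       # average disk capacity per host (GB)
MEM_MB = 128       # average available memory per host (MB)
GFLOPS = 1         # average processing power per host (GFLOPS)
BW_KBPS = 100      # average bandwidth per host (Kb/s)

# Share of hosts that join, and share of each resource a joining host offers.
HOST_SHARE, CPU_SHARE, MEM_SHARE, DISK_SHARE, BW_SHARE = 0.01, 0.50, 0.50, 0.10, 0.25


def edge_resources():
    peers = HOSTS * HOST_SHARE
    return {
        "processors_M": peers / 1e6,
        "disk_PB": peers * DISK_GB * DISK_SHARE / 1e6,            # GB -> PB
        "memory_TB": peers * MEM_MB * MEM_SHARE / 1e6,            # MB -> TB
        "pflops_at_cpu_share": peers * GFLOPS * CPU_SHARE / 1e6,  # GFLOPS -> PFLOPS
        "pflops_at_full_use": peers * GFLOPS / 1e6,
        "bw_per_link_kbps": BW_KBPS * BW_SHARE,
    }


for name, value in edge_resources().items():
    print(f"{name:<20} {value:g}")
```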
6
Sharing resources ?
• What about supercomputers ?
IBM ASCI White
12.3 TFLOPS
8192 processors
512 RS/6000 processing nodes
6.2 TB memory storage
160 TB disk storage
110 M$
106 tons
P2P-supercomputer
1.5 PFLOPS
1.5 M processors
92 TB memory storage
1.5 PB disk storage
? M$
? tons
> x 10 !
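How far the "> x 10" claim holds for each resource can be seen by dividing the two sets of figures entry by entry. A minimal sketch: the dictionary keys are illustrative names, the numbers are those on the slide (1.5 PFLOPS and 1.5 PB written as 1500 TFLOPS and 1500 TB).

```python
# Figures copied from the slide; each ratio is P2P aggregate / ASCI White.
ASCI_WHITE = {"tflops": 12.3, "processors": 8192, "memory_tb": 6.2, "disk_tb": 160}
P2P_EDGE = {"tflops": 1500.0, "processors": 1_500_000, "memory_tb": 92, "disk_tb": 1500}


def ratios(p2p=P2P_EDGE, reference=ASCI_WHITE):
    return {key: p2p[key] / reference[key] for key in reference}


for key, value in ratios().items():
    print(f"{key:<11} x {value:,.1f}")
```

With these inputs the disk ratio comes out just below 10, and the other three well above it.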
7
P2P @ edge ?
• How to unleash the power of the “Internet’s dark matter” ?
8
P2P popularity
2003 summer download hit parade [www.download.com]
(six of the ten entries are marked P2P on the slide)
Application : [Last week] / [Total]
1. Kazaa Media Desktop : 2 644 777 / 261 405 295
2. ICQ Lite : 588 141 / 25 423 064
3. AOL Instant Messenger (AIM) : 532 897 / 17 521 190
4. iMesh : 392 703 / 55 145 269
5. WinZip : 351 865 / 100 741 790
6. ICQ Pro 2003a beta : 332 624 / 233 204 712
7. Spybot – Search & Destroy : 232 993 / 2 764 380
8. Ad-aware : 224 720 / 19 078 555
9. Morpheus : 179 347 / 114 140 262
10. DownloadAccelerator Plus : 119 601 / 36 355 895
9
P2P popularity
[Figure: Internet Applications Adoption Rate. Users in millions (0 to 70) versus month (1 to 23) for Hotmail, ICQ and Napster]
Gnutella network : up to 400 000 nodes operating world wide
Napster : the early days …
10
Architectural view
Mediated P2P : Napster, Audiogalaxy
Pure P2P : Early Gnutella, FreeNet
Hybrid P2P : Gnutella, FastTrack, Kazaa
11
P2P-architectures
 | mediated | pure | hybrid
data traffic | P2P | P2P | P2P
control traffic | client-server | P2P | local : client-server, long distance : P2P
efficiency | + efficient search, + efficient control | - inefficient search, - BW consuming | +/-
scalability | - control hot spot (mirrors needed ?) | - BW needed grows rapidly | good compromise
robustness | - single point of failure, - easy to attack | + graceful degradation, + difficult to attack | ?
accountability | easy | difficult | difficult
13
P2P taxonomy
[Figure: taxonomy grid. Application areas (content sharing, distributed computing, instant messaging, collaborative working) versus architecture (mediated, pure, hybrid)]
14
File Sharing performance
150 M searches/day | 1.6 M downloads/day
10 TB data transfer/day | 1-2 TB data transfer/day
100 servers | 15000 servers
15
Distributed computing performance
[Figure: data flow. 10 tapes/week, 350 GB; 35 GB/tape, 16 hours recorded data; 10 000 0.3 MB work units]
SETI = “Search for Extraterrestrial Intelligence”
• started in 1998 as a 2 year project (but still running)
• 4 M users signed up so far
• Radio telescope data sent to clients for digital signal analysis
• Nodes process data when cycles are available (works as screen saver)
• Using resources to allow better signal analysis
16
Distributed computing performance
computations per work unit : 3.1 x 10^12 FP-operations
work unit throughput : 700 000/day
22 x 10^17 FLOP/day
> 25 TFLOPS
 | SETI@home | ASCI White @ DoE
Processing | 25 TFLOPS | 12.3 TFLOPS
Cost | 1 M USD | 110 M USD
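The throughput figure follows directly from the two inputs on the slide, and the cost comparison can be expressed per TFLOPS. A short sketch for checking the arithmetic; the function names are illustrative.

```python
FLOP_PER_WORK_UNIT = 3.1e12      # from the slide
WORK_UNITS_PER_DAY = 700_000     # from the slide
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_tflops(flop_per_unit=FLOP_PER_WORK_UNIT, units_per_day=WORK_UNITS_PER_DAY):
    """Average processing rate implied by the daily work unit throughput."""
    flop_per_day = flop_per_unit * units_per_day   # about 2.2e18, i.e. 22 x 10^17
    return flop_per_day / SECONDS_PER_DAY / 1e12


def cost_per_tflops(cost_musd, tflops):
    """Cost in M USD per TFLOPS."""
    return cost_musd / tflops


print(f"SETI@home  : {sustained_tflops():.1f} TFLOPS, "
      f"{cost_per_tflops(1, 25):.2f} M USD per TFLOPS")
print(f"ASCI White : 12.3 TFLOPS, {cost_per_tflops(110, 12.3):.2f} M USD per TFLOPS")
```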
17
Scaling problems
Mechanisms in GNUTELLA to limit traffic
• Network horizon set by TTL
• Descriptor ID’s avoid cyclic routing
• PONG/QueryHIT/Push NOT flooded
BUT ...
[Figure: Bandwidth in KB/PING (0 to 10000) versus horizon (0 to 8)]
“1 Gnutella request would cause 90MB data traffic on Napster scale network”
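Why traffic grows so quickly with the horizon can be illustrated with a small model. The sketch below is not the Gnutella wire protocol: `flood` pushes one descriptor through a hypothetical overlay (a dict of neighbour lists) with a TTL and duplicate suppression by descriptor ID, and `tree_messages` counts the messages in an idealised loop-free overlay in which every node keeps the same number of links. All names and the example degrees are assumptions made for illustration.

```python
from collections import deque


def flood(graph, origin, ttl):
    """Flood one descriptor from `origin`; return (nodes reached, messages sent).

    Every node remembers the descriptor ID the first time it sees it and
    drops later copies, so no node forwards the same descriptor twice.
    """
    seen = {origin}                          # nodes that know this descriptor ID
    queue = deque([(origin, None, ttl)])     # (node, previous hop, remaining TTL)
    messages = 0
    while queue:
        node, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue                         # horizon reached, stop forwarding
        for neighbour in graph[node]:
            if neighbour == came_from:
                continue                     # never send back over the incoming link
            messages += 1                    # a copy is sent even if it is a duplicate
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, node, remaining - 1))
    return len(seen) - 1, messages


def tree_messages(degree, ttl):
    """Messages for one flood if every node has `degree` links and there are no loops."""
    return sum(degree * (degree - 1) ** (hop - 1) for hop in range(1, ttl + 1))


ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(flood(ring, origin=0, ttl=2))
for horizon in range(1, 8):
    print(horizon, tree_messages(4, horizon), tree_messages(8, horizon))
```

In the loop-free model a single request with 4 links per node and a horizon of 7 already generates 4372 messages; with 8 links per node it exceeds one million.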
18
Scaling answers
1. Reduce network horizon to reduce f
2. Use of reflectors
= node with high BW available
- mimics peer sharing all files of its “clients”
3. Use of UltraPeers
= same principle as reflector, but chosen dynamically
[Figure: nodes with high BW access handle all PING/PONG and QUERY/QUERYHIT traffic; nodes with low access BW handle ONLY download traffic]
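The reflector/UltraPeer principle, one well-connected node answering queries on behalf of its low-bandwidth clients, can be sketched as a small index. This illustrates the idea only and is not the message format of any real client; the class, its methods and the example addresses are hypothetical.

```python
class UltraPeer:
    """High-bandwidth node that answers queries on behalf of its clients."""

    def __init__(self):
        self.index = {}                      # file name -> set of client addresses

    def register(self, client, files):
        """A client announces the files it shares when it connects."""
        for name in files:
            self.index.setdefault(name, set()).add(client)

    def unregister(self, client):
        """Forget a client that went offline."""
        for holders in self.index.values():
            holders.discard(client)

    def query(self, name):
        """Answer a query locally; clients are contacted only for the download."""
        return sorted(self.index.get(name, ()))


up = UltraPeer()
up.register("10.0.0.1", ["a.mp3", "b.mp3"])
up.register("10.0.0.2", ["b.mp3"])
print(up.query("b.mp3"))
```

Because query traffic stops at the UltraPeer, a client behind a slow access link spends its bandwidth on downloads only.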
19
Robustness
• self-organization leads to power-law networks (1% of servents show server-like behaviour …)
• very robust to random node failure
• more vulnerable to targeted attacks
Simulation result for FreeNet peers
[T. Hong, “Performance”, Chapter 14 in “Peer-to-peer : Harnessing the Benefits of a Disruptive Technology”, ISBN 0-596-00110-X, O’Reilly, March 2001.]
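The contrast between random failure and targeted attack can be explored with a toy experiment. The sketch below is not the FreeNet simulation from the cited chapter: it grows a hub-dominated graph by preferential attachment (an assumed stand-in for a self-organized overlay), removes either random nodes or the best-connected ones, and reports the largest group of nodes that can still reach each other. All function names and parameters are illustrative.

```python
import random


def largest_component(graph, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    alive = set(graph) - set(removed)
    best, visited = 0, set()
    for start in alive:
        if start in visited:
            continue
        stack, size = [start], 0
        visited.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in graph[node]:
                if neighbour in alive and neighbour not in visited:
                    visited.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def preferential_attachment(n, seed=0):
    """Each new node links to one existing node, chosen proportionally to its degree."""
    rng = random.Random(seed)
    graph = {0: {1}, 1: {0}}
    endpoints = [0, 1]                 # every node appears once per link it has
    for new in range(2, n):
        target = rng.choice(endpoints)
        graph[new] = {target}
        graph[target].add(new)
        endpoints += [new, target]
    return graph


def targeted(graph, k):
    """The k best-connected nodes (ties broken by node id)."""
    return sorted(graph, key=lambda v: (-len(graph[v]), v))[:k]


def random_failures(graph, k, seed=0):
    """k nodes picked uniformly at random."""
    return random.Random(seed).sample(sorted(graph), k)


overlay = preferential_attachment(1000)
print("random  :", largest_component(overlay, random_failures(overlay, 10)))
print("targeted:", largest_component(overlay, targeted(overlay, 10)))
```

Varying the seed and the number of removed nodes shows how much of the overlay depends on its few hubs.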
20
Free-riding on Gnutella
Network size since Jan 2002
- only 30 % of nodes offering content
- 50% of queries satisfied by 1% of servents
[www.limewire.com]
21
Overlay mismatch
Mismatch between application layer network and physical network
• 40% Gnutella clients belong to top 10% AS
• only 2-5% links within AS
[Figure: node clustering based on domain names versus clustering based on network traffic analysis]
Gnutella’s clustering logic shows no/little correlation with domain name based clustering
[M. Ripeanu, A. Iamnichi, I. Foster, “Mapping the Gnutella Network”, IEEE Internet Computing, January-February 2002.]
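A figure such as "only 2-5% links within AS" is a simple ratio once every peer has been mapped to its autonomous system. The sketch below shows that calculation on made-up data; the peer names, the links and the AS numbers (taken from the range reserved for documentation) are all hypothetical.

```python
def intra_as_fraction(links, as_of):
    """Share of overlay links whose two endpoints sit in the same AS."""
    if not links:
        return 0.0
    same = sum(1 for a, b in links if as_of[a] == as_of[b])
    return same / len(links)


# Hypothetical overlay: four peers in three autonomous systems.
as_of = {"p1": 64500, "p2": 64500, "p3": 64501, "p4": 64502}
links = [("p1", "p2"), ("p1", "p3"), ("p2", "p4"), ("p3", "p4")]
print(f"{intra_as_fraction(links, as_of):.0%} of links stay within one AS")
```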
22
Business Models
How to monetise P2P ?
• authors agree on “P2P business models are unclear”
• reality : few companies make money on P2P
• current situation : File sharing application sponsored by advertisement (banners)
• some other possibilities
  • micropayment mechanisms
  • indirect mechanisms (P2P will increase BW-need and hence …)
  • tip based strategy (cf. US-model …)
  • make “low”-quality content available to get people interested in specific content
  • make use of end users’ devices to reduce cost !
23
Problems/issues/barriers/challenges
Problems | Solutions
node/link transient nature, robustness | File-sharing : content redundancy; Cycle-sharing : checkpointing
scalability, bandwidth consumption | Hybrid approach; Avoid floodings (e.g. FreeNet : intelligent routing); Content/Query caching; TTL; Avoid routing cycles
Network discontinuities (firewalls, (dynamic) NAT) | (Ab)use of port 80; Rendez-vous servers
24
Problems/issues/barriers/challenges
Problems | Solutions
application redesign | P2P-frameworks
free-riding, accountability | micro-payment
asymmetric bandwidth in access (ADSL, HFC) | combine uplink capacity (e-donkey)
inefficient overlay | Network/infrastructure aware routing
Privacy/trust, Anonymity | Encryption techniques (e.g. FreeNet : plausible deniability for node operators)
business models ? | ???
25
P2P-trends
• emergence of platforms
• convergence between Grid-computing and P2P-technology
• enhance P2P-performance
  • semantic searches (Tapestry, Content Addressable Networks …)
  • Query/result caching
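Query/result caching can be pictured as a small table of recent answers that expire after a while, so that a repeated query does not have to travel through the overlay again. The sketch below is a generic time-limited cache, not the mechanism of any particular P2P system; the class name, the 60 second default and the injectable clock are assumptions.

```python
import time


class QueryCache:
    """Remember recent query results for `ttl` seconds to avoid searching again."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}                    # query -> (expiry time, results)

    def get(self, query):
        """Return cached results, or None if unknown or expired."""
        hit = self.entries.get(query)
        if hit is None:
            return None
        expires, results = hit
        if self.clock() >= expires:
            del self.entries[query]          # stale entry: force a fresh search
            return None
        return results

    def put(self, query, results):
        """Store the results of a search that was just carried out."""
        self.entries[query] = (self.clock() + self.ttl, list(results))


cache = QueryCache(ttl=60.0)
cache.put("some song", ["peerA", "peerB"])
print(cache.get("some song"))
```

The expiry time trades freshness against traffic: peers come and go, so a long-lived entry may point to nodes that have already left.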
26
Platform emergence
Application areas : File sharing, Distributed computing, Instant Messaging, Collaboration
Dedicated Application Programs and Protocols (Freenet, Gnutella, Groove, SETI@home, eDonkey)
• for 1 application area
• non-generic
• 1 application class
• 1 specific problem
• network interoperability ?
Platforms / Frameworks
• offer generic services
• support the P2P paradigm
• used to build P2P applications
27
JXTA
• developed by Sun Microsystems
• set of 6 XML based open protocols
• Java API offered
[Figure: JXTA layers]
Applications : JXTA Community Applications, Sun JXTA Applications, JXTA Shell (e-mail, auctioning, data storage)
Services : JXTA Community Services, Sun JXTA Services, Peer Commands (indexing, searching, file sharing)
Core : Peer Groups, Peer Pipes, Peer Monitoring, Security (peer establishment, communication management, routing)
[http://www.jxta.org]
28
BOINC
• Berkeley Open Infrastructure for Network Computing
• allows participants to help solve selected problems
• = “generic SETI@Home”
[http://boinc.berkeley.edu]
29
Conclusions
For network operators
P2P applications can be very BW-consuming
• extremely popular (and addictive)
• use of inefficient strategies (broadcast, flooding, …)
• “tragedy of the commons”
Danger of bottlenecks
• overlay network has little relation to physical infrastructure
• symmetric relations between peers
Change in user behaviour
• “always” online
• information provider AND information consumer
30
Conclusions
For application developers
People are willing to share resources for free (and even want to spend money …)
• make people feel they participate in a large project
• give some credit to users (competition) (top 10 list, eternal fame if solution is found, …)
To avoid digging one’s own grave
• avoid BW-consuming strategies
• include micropayment/trust mechanisms to
  - encourage participation
  - avoid free-riding
  - avoid DoS attacks
People are (extremely) interested in digital content
31
Conclusions
For application developers
Hacker danger
• need for encryption mechanisms
High performance P2P-platforms are emerging
• reuse of efforts
• reuse of user community
Make sure your application has some scaling effect
• the more users, the more interesting to join !