Post on 27-Dec-2015
1
ISP-aided Biased Query Search in P2P Systems
Vinay Aggarwal and Anja Feldmann
Vinay.Aggarwal@telekom.de
Deutsche Telekom Laboratories / TU Berlin Berlin, Germany
2
Introduction
• P2P traffic >50% of Internet traffic
– BitTorrent, eDonkey, Skype, GoogleTalk…
• P2P systems form overlays at application layer
– neighbour selection arbitrary
– routing independent of Internet AS routing
• Routing layer functionality duplicated at application layer
• P2P users want performance
– Measure topology themselves (use RTT) => overhead
– Build topology agnostic of underlay => performance loss
• ISPs in a dilemma
– P2P spurs broadband demand, still ISPs lose money
– Traffic Engineering difficult with P2P traffic
• Lack of coordination => Tension!
3
ISP-P2P tension
Random/RTT-based peer selection => peerings cross ISP boundaries multiple times, often unnecessarily
4
Solution: ISP-P2P Cooperation
• Concept: ISP knows its network
– Node: last-hop bandwidth, geographical location, service class
– Routing policy, OSPF/BGP metrics, AS distance to other ISPs
• Each ISP offers Oracle service
• P2P nodes query it during neighbour selection or file exchange, send list of potential neighbours
• Oracle ranks these by proximity
– Inside network, last-hop bandwidth, geographical location (city/PoP), AS hops
• ISP-aided optimal P2P neighbour selection
• Simple and general solution, open for all overlays
• Run as Web server or UDP service at known location
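The ranking step can be illustrated with a minimal sketch. It assumes the oracle already knows, for each candidate IP, the AS it belongs to, its AS-hop distance from the querying peer's ISP, and its last-hop bandwidth. The names PeerInfo and rank_peers, the addresses, and the tie-breaking order (own network first, then fewer AS hops, then higher bandwidth) are illustrative assumptions, not the oracle implementation used in the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerInfo:
    """What a hypothetical oracle knows about one candidate neighbour."""
    ip: str
    asn: int               # AS the candidate belongs to (made-up numbering)
    as_hops: int           # AS-hop distance from the querying peer's ISP
    bandwidth_mbps: float  # last-hop bandwidth known to the ISP


def rank_peers(candidates, home_asn):
    """Sort candidates by proximity: inside the network first, then fewer
    AS hops, then higher last-hop bandwidth (assumed tie-breaking order)."""
    def proximity_key(peer):
        outside = peer.asn != home_asn  # False sorts before True
        return (outside, peer.as_hops, -peer.bandwidth_mbps)
    return sorted(candidates, key=proximity_key)


# The list of potential neighbours a P2P node would send to the oracle.
candidates = [
    PeerInfo("10.0.3.7", asn=3, as_hops=2, bandwidth_mbps=16.0),
    PeerInfo("10.0.1.4", asn=1, as_hops=0, bandwidth_mbps=2.0),
    PeerInfo("10.0.2.9", asn=2, as_hops=1, bandwidth_mbps=6.0),
    PeerInfo("10.0.1.8", asn=1, as_hops=0, bandwidth_mbps=16.0),
]
ranked = rank_peers(candidates, home_asn=1)
print([peer.ip for peer in ranked])
```

The P2P node stays free to use or ignore the returned order; the sketch only reorders the list the node supplied.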
5
How Oracle works
6
Advantage for ISP/P2P
• Measurement overhead eliminated
– Utilize knowledge of ISP
• Avoid high-latency paths and bottlenecks at inter-ISP transit/peering links
• ISPs regain control of network traffic
• Traffic across ISP boundaries reduced => immense cost savings
• Better QoS to other applications, improved service to customers
7
Impact on network structure
• Node degree and mean overlay path length unchanged
• Graph remains connected, overlay & underlay diameter constant
• Large improvement in AS distance and intra-AS peerings
• Impact on flow conductance minimal
• Densely connected subgraphs local to ISPs
– P2P topology correlated with AS topology
8
Overlay-Underlay Topology Correlation
Random vs biased Gnutella topology
9
Why Testlab?
• Real traffic instead of simulated flows
• Configure network devices (routers, switches, machines)
– Generate variegated network scenarios and traffic environments
• Wide range of experiments using real applications, network stacks, OS
• Better control & visibility vs Internet
– No adverse effect on Internet traffic
10
Testlab used for experiments
11
Experimental Topologies
• Internet consists of Autonomous Systems (AS)
– Prefix-based packet forwarding, based on AS policies
• P2P systems set up overlay topology
– Implement own routing on query/key basis
• Design multiple-AS topology, each AS hosts multiple P2P users
• Router is an abstraction of AS boundary
• 5 routers => 5-AS topology
– Each router connects 3 machines, each machine runs 3 P2P applications concurrently
– 5 ASes, 15 machines, 45 P2P users
12
AS Topologies
Realistic Topology
Ring Topology
Star Topology
Tree Topology
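As a rough sketch of how such a testbed layout can be modelled, the code below enumerates the 45 P2P users (5 ASes x 3 machines x 3 applications) and computes AS-hop distances with a breadth-first search. The link layouts are assumptions: the ring connects AS i to AS i+1, and the star uses AS 0 as the hub. The slides do not give the exact wiring, and the realistic and tree topologies are left out because their links cannot be recovered from the text.

```python
from collections import deque

NUM_AS = 5             # one router per AS
MACHINES_PER_AS = 3    # machines attached to each router
PEERS_PER_MACHINE = 3  # P2P applications running on each machine


def build_peers():
    """Enumerate every P2P user as an (AS, machine, application) triple."""
    return [(a, m, p)
            for a in range(NUM_AS)
            for m in range(MACHINES_PER_AS)
            for p in range(PEERS_PER_MACHINE)]


def ring_links(n):
    """Assumed ring wiring: AS i is linked to AS (i + 1) mod n."""
    return {(i, (i + 1) % n) for i in range(n)}


def star_links(n):
    """Assumed star wiring: AS 0 is the hub for all other ASes."""
    return {(0, i) for i in range(1, n)}


def as_hops(links, n, src):
    """Breadth-first search over AS links; returns hop counts from src."""
    adjacency = {i: set() for i in range(n)}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return [dist[i] for i in range(n)]


print(len(build_peers()))                   # total number of P2P users
print(as_hops(ring_links(NUM_AS), NUM_AS, 0))
print(as_hops(star_links(NUM_AS), NUM_AS, 1))
```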
13
Configuration of a topology
…using VLAN, VTP, ifconfig, route
14
Testlab topologies
15
P2P System: Gnutella
• Unstructured, open-source, popular file-sharing P2P system
• Each servent bootstraps by flooding Pings to known nodes, answered by Pongs
• Search content by flooding Query, answered by QueryHit (QH)
• Msgs carry TTL (max 7) and msg ID
• Servent selects a node randomly from all QHs to download desired content from
– File exchange using HTTP, outside Gnutella
• Ultrapeers (UP) and leafs form 2-level hierarchy
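The flooding behaviour described on this slide can be sketched as follows: a simplified model in which each node forwards a Query to all neighbours except the sender, the TTL drops by one per hop, and the message ID lets a node discard duplicates. The function name flood_query and the adjacency-dictionary input are illustrative assumptions. The sketch ignores the ultrapeer/leaf hierarchy and is not GTK-Gnutella's code.

```python
from collections import deque

MAX_TTL = 7  # maximum TTL carried by Gnutella messages, as stated above


def flood_query(adjacency, origin, ttl=MAX_TTL):
    """Flood one Query from origin.

    adjacency maps each node to an iterable of its overlay neighbours.
    Returns (set of nodes that saw the Query, number of messages sent).
    """
    seen = {origin}   # nodes that already handled this message ID
    messages = 0
    queue = deque([(origin, None, ttl)])
    while queue:
        node, sender, ttl_left = queue.popleft()
        if ttl_left == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbour in adjacency[node]:
            if neighbour == sender:
                continue  # never send the Query back where it came from
            messages += 1
            if neighbour not in seen:
                # First copy of this message ID: process and forward it.
                seen.add(neighbour)
                queue.append((neighbour, node, ttl_left - 1))
            # Otherwise the duplicate is dropped at the receiver.
    return seen, messages


# Small overlays for illustration (node names are made up).
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

print(flood_query(line, "a", ttl=2))
print(flood_query(triangle, "a"))
```

In the triangle, two of the four messages are duplicates that the receiver drops. Overhead of this kind grows with the number of overlay links a flood crosses.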
16
Experimental Setup
• Each machine has 1 UP & 2 leafs, all run GTK-Gnutella
• Central machine runs oracle
– Servents send list of IPs to oracle, which sorts them according to parent AS and AS-hop distance, returns list to servent
• File-sharing schemes
– Uniform: 6 unique files on all servents
– Variable: UP shares 12 files, one leaf 6, the other leaf 0
• Compare number of responses to Query
• Each servent introduces a unique Query string
– Realistic query string distribution (mp3, album/artist)
• Run unmodified and biased P2P experiments
Number of Query Messages
Topology     Uniform File Sharing         Variable File Sharing
             Unmod. P2P   Biased P2P      Unmod. P2P   Biased P2P
Realistic    6604         2473            10194        4873
Ring         6623         2512            10939        4834
Star         6679         2533            10902        4863
Tree         6643         2468            10872        4847
=> More than 50% reduction in Query messages with ISP-aided biased P2P neighbour selection
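The reduction can be checked directly from the message counts in the table above. The short script below copies those counts and computes the relative reduction for each topology and file-sharing scheme; the dictionary layout and the name reduction_percent are illustrative choices only.

```python
# (unmodified, biased) Query message counts, copied from the table above.
query_messages = {
    "Realistic": {"uniform": (6604, 2473), "variable": (10194, 4873)},
    "Ring":      {"uniform": (6623, 2512), "variable": (10939, 4834)},
    "Star":      {"uniform": (6679, 2533), "variable": (10902, 4863)},
    "Tree":      {"uniform": (6643, 2468), "variable": (10872, 4847)},
}


def reduction_percent(unmodified, biased):
    """Relative reduction in Query messages, in percent."""
    return 100.0 * (unmodified - biased) / unmodified


reductions = {
    (topology, scheme): reduction_percent(*counts)
    for topology, schemes in query_messages.items()
    for scheme, counts in schemes.items()
}

for (topology, scheme), value in reductions.items():
    print(f"{topology:<10} {scheme:<9} {value:5.1f}%")
```

Every entry comes out above 50%, with the uniform file-sharing scheme showing the larger reduction in all four topologies.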
18
Query responses (Uniform FS)
=> Query responses with biased P2P (dotted) similar to unbiased P2P (bold)
19
Query Responses (Variable FS)
=> Effects similar across file distribution patterns and topologies
20
Query responses (rare queries)
=> Effects hold even for different query types
21
Large scale simulations
• SSFNet: discrete-event, packet-level simulator
• 700 node P2P, 16 ASes, churn, free-riding
– behaviour as observed in real world
• Query traffic reduced by 54%
• Swarming pattern of Queries benefits
– Reachability at remote locations improves
• Network discovery traffic reduced by 42%
• Number of responses per Query similar
– Number of unsuccessful Queries same
22
Responses per Query
• Distribution of queries similar
• Mean: 127 vs 102, Median: 78 vs 62
23
Conclusion & Future Work
• Unique and simple ISP-P2P collaboration concept, so that both benefit
• Scalability of P2P networks improves
– Negotiation and query traffic reduced by 50%
• No adverse effects on query search process
– Stable across popular, rare, unsuccessful Queries
– Reachability of Queries at remote locations improves
• Advantages hold across topologies and scale
• Planetlab experiments coming…