Experience with an Object Reputation System for Peer-to-Peer File Sharing
NSDI ’06 (3rd USENIX Symposium on Networked Systems Design & Implementation)
Kevin Walsh, Emin Gün Sirer, Cornell University
Presenter: Elaine
Problem
• A P2P filesharing application with search capability (e.g. Gnutella)
• Filesharing apps use meta-data for searching
– Meta-data such as file name, file size, file descriptors, content hash, etc.
• Problem
– Users blindly believe the meta-data
– Object authenticity (or reputation): downloaded file == what it claims to be
• Current peer-to-peer filesharing networks are rife with corrupt and mislabeled content
• Much of this pollution can be attributed to deliberate attacks [14]
Recent approaches
• Past experience
– Small portion of peers
• # of shared files as an endorsement
– A large number of malicious peers can share the same files
– Angry users may share
• Voting
– Trust in voters?
– No incentive to vote
• Call for a trustworthiness metric
– Credence
Credence
• How to decide the reputation of an object
– Use voting and deal with the trust problem
• How?
– Compare the voting histories of two peers
– Trust peers with identical votes more
• Correlation computation
– If two peers don’t share enough history, build a trust relationship graph and trust multi-hop peers (transitive trust)
Computing correlation
• Calculate the trust relationship between each pair of peers
– Phi coefficient: a standard formula for the coefficient of correlation between binary data sets
• Theta takes values in the range [-1, 1]
• Positive values indicate agreement
• Negative values indicate disagreement
        A-   A+   Total
B-       3    3    6
B+       5    2    7
Total    8    5
θ = (2*3 - 3*5) / sqrt(5*8*7*6) ≈ -0.22
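A minimal sketch of the Phi coefficient computation, using the counts from the 2x2 table on this slide. The function and argument names are illustrative and not taken from Credence; returning 0.0 when the denominator is zero is an assumption made here for the sketch.

```python
import math

def phi_coefficient(n_neg_neg, n_pos_neg, n_neg_pos, n_pos_pos):
    """Phi coefficient for a 2x2 table of paired binary votes.

    Each argument counts objects by (A's vote, B's vote);
    for example n_pos_neg means A voted positive and B voted negative.
    """
    a_pos = n_pos_neg + n_pos_pos  # column total A+
    a_neg = n_neg_neg + n_neg_pos  # column total A-
    b_pos = n_neg_pos + n_pos_pos  # row total B+
    b_neg = n_neg_neg + n_pos_neg  # row total B-
    denom = math.sqrt(a_pos * a_neg * b_pos * b_neg)
    if denom == 0:
        # Assumption: an undefined correlation is treated as "no information".
        return 0.0
    # Agreements (both +, both -) push theta up; disagreements push it down.
    return (n_pos_pos * n_neg_neg - n_pos_neg * n_neg_pos) / denom

# Counts from the slide's table: 5 agreements, 8 disagreements.
theta = phi_coefficient(n_neg_neg=3, n_pos_neg=3, n_neg_pos=5, n_pos_pos=2)
print(round(theta, 2))
```

Because A and B disagree on more objects than they agree on, theta comes out negative.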
Transitive trust (ref. from K Walsh, EG Sirer)
• Voting history (1 == correct, 0 == incorrect)
        A    B    C
Obj 1   1
Obj 2   0
Obj 3   1    1
Obj 4   1    1
Obj 5        0    0
Obj 6        0    0
Obj 7             1
Obj 8             1
• Local trust: between A and B, and between B and C (overlapping vote histories)
• Transitive trust: between A and C, θac = θab * θbc
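A small sketch of the product rule θac = θab * θbc, generalized to a path of peers. The helper name, the dictionary layout, and the correlation values are hypothetical; this only illustrates the multiplication along one path, not how Credence chooses paths in its trust graph.

```python
def transitive_trust(local_trust, path):
    """Multiply local correlations along a path of peers, e.g. ["A", "B", "C"]."""
    theta = 1.0
    for src, dst in zip(path, path[1:]):
        theta *= local_trust[(src, dst)]
    return theta

# Hypothetical local correlations between directly comparable peers.
local_trust = {("A", "B"): 0.8, ("B", "C"): 0.5}
theta_ac = transitive_trust(local_trust, ["A", "B", "C"])
print(theta_ac)
```

Since every local value lies in [-1, 1], the product can only shrink in magnitude as the path gets longer, so distant peers carry less weight.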
Voting on Object
• A vote is a signed tuple: <H, S, T> K
– H: file content hash
– S: statement about the file
  • Thumb up (unconditionally thumb up)
  • Thumb down (unconditionally thumb down)
– T: timestamp
– K: certificate
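A hedged sketch of how the vote tuple could be represented. The class, field, and function names are hypothetical. The HMAC below is only a standard-library stand-in for the signature step; a deployed system would use a public-key signature tied to the voter's certificate.

```python
import hashlib
import hmac
from dataclasses import dataclass

@dataclass(frozen=True)
class Vote:
    content_hash: str  # H: hash of the file content
    statement: int     # S: +1 for thumb up, -1 for thumb down
    timestamp: int     # T: time the vote was cast (Unix seconds)

    def payload(self) -> bytes:
        # Canonical byte string covered by the signature.
        return f"{self.content_hash}:{self.statement}:{self.timestamp}".encode()

def sign_vote(vote: Vote, key: bytes) -> str:
    # Stand-in only: HMAC-SHA256 instead of a certificate-based signature.
    return hmac.new(key, vote.payload(), hashlib.sha256).hexdigest()

def verify_vote(vote: Vote, signature: str, key: bytes) -> bool:
    return hmac.compare_digest(sign_vote(vote, key), signature)

file_hash = hashlib.sha1(b"example file bytes").hexdigest()
vote = Vote(content_hash=file_hash, statement=+1, timestamp=1136073600)
key = b"demo-key"  # hypothetical key material
signature = sign_vote(vote, key)
print(verify_vote(vote, signature, key))
```

Changing any field after signing (for example flipping the statement to -1) makes verification fail, which is the property the signed tuple is meant to provide.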
Algorithm
• Three basic operations in Credence: voting, issuing a vote-gather query, evaluating the object reputation
• Voting
– A peer casts a vote on an object after each download and stores it locally in the vote database
• Sender (issuing a vote-gather query)
– Issues a vote-gather query, specifying the content hash of a given object, to the underlying Gnutella network, and stores the gathered votes in the vote database
• Receiver (after receiving a vote-gather query)
– Sends back its own matching votes and any matching votes it has seen recently that carry the highest reliability weight
– Advantages:
  • Bounds the overall cost of vote collection
  • Voters are not required to remain online
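The receiver side of a vote-gather query might look roughly like the sketch below. The data layout, the `limit` parameter, and the function name are assumptions for illustration; the slide only states that a peer returns its own matching votes plus the most reliable matching votes it has seen recently.

```python
def answer_vote_gather(query_hash, own_votes, seen_votes, weights, limit=2):
    """Build the reply to a vote-gather query for one content hash.

    own_votes:  dict mapping content hash -> statement (+1 or -1)
    seen_votes: list of (voter_id, content_hash, statement) seen recently
    weights:    dict mapping voter_id -> reliability weight
    limit:      hypothetical cap on forwarded votes, bounding reply size
    """
    response = []
    if query_hash in own_votes:
        response.append(("self", query_hash, own_votes[query_hash]))
    matching = [v for v in seen_votes if v[1] == query_hash]
    # Forward only the votes from the voters this peer weights most highly.
    matching.sort(key=lambda v: weights.get(v[0], 0.0), reverse=True)
    response.extend(matching[:limit])
    return response

own_votes = {"h1": +1}
seen_votes = [("p1", "h1", -1), ("p2", "h1", +1), ("p3", "h2", +1), ("p4", "h1", +1)]
weights = {"p1": 0.2, "p2": 0.9, "p4": 0.5}
reply = answer_vote_gather("h1", own_votes, seen_votes, weights)
print(reply)
```

Capping the forwarded votes keeps the reply size bounded, and forwarding stored votes means the original voters do not have to be online.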
• Votes that apply positively are given an initial value of +1, and those that apply negatively −1
• Look up the trust relationship with each voter in the correlation table
• Calculate the weighted average of votes using correlation values to derive the object reputation scores
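The evaluation step can be sketched as a correlation-weighted average of +1/-1 votes. Normalizing by the sum of absolute correlations and skipping voters with no known correlation are assumptions of this sketch; the function and variable names are illustrative.

```python
def object_reputation(votes, correlation):
    """Weighted average of votes on one object.

    votes:       dict mapping voter_id -> +1 (positive) or -1 (negative)
    correlation: dict mapping voter_id -> theta in [-1, 1]
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for voter, value in votes.items():
        theta = correlation.get(voter)
        if theta is None:
            continue  # assumption: ignore voters with no trust relationship
        # A negatively correlated voter's vote counts in the opposite direction.
        weighted_sum += theta * value
        total_weight += abs(theta)
    if total_weight == 0:
        return 0.0  # no usable votes
    return weighted_sum / total_weight

votes = {"p1": +1, "p2": -1, "p3": +1}
correlation = {"p1": 0.8, "p2": -0.4}  # p3 is unknown to this peer
score = object_reputation(votes, correlation)
print(score)
```

In this example the negative vote from a negatively correlated peer reinforces the positive vote from a trusted peer, so the score lands at the positive end of the range.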
Evaluation Overview
• 10,000 downloads since March 2005
• 2 crawlers collected 200 daily snapshots of the network structure
• Dataset
– Data compiled from about 1,200 Credence clients
– 39,000 votes and 84,000 files
Correlation between Credence peers
• Presents the correlation values between any pair of peers with overlapping vote histories
• On average, each node is directly correlated with 27 other peers
• Four groups of peers
Credence users classification
• 35% altruistic users, 50% non-participants, and 15% attackers
• Attackers may not have malicious intentions
• The votes from attackers actually provide a tangible benefit to the system
• File authenticity is a fairly universal concept among filesharing users
Local and Transitive Relationships
[Figure: % of peers with valid correlation values]
• Not many high-quality correlations
Different correlation strength and size of usable vote set
• Consistency
– The number of pairs of votes in agreement divided by the number of pairs in agreement or conflict
[Figure: size of usable vote set and consistency of usable vote set]
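One way to read the consistency definition is as a ratio over all pairs of votes cast on the same object. That reading, the function name, and the `None` result for fewer than two votes are assumptions of this sketch.

```python
def consistency(votes):
    """Agreeing vote pairs divided by all vote pairs, for +1/-1 votes on one object."""
    pos = votes.count(1)
    neg = votes.count(-1)
    # Pairs that agree: two positives or two negatives.
    agree = pos * (pos - 1) // 2 + neg * (neg - 1) // 2
    # Pairs that conflict: one positive with one negative.
    conflict = pos * neg
    total = agree + conflict
    if total == 0:
        return None  # fewer than two votes, so no pairs to compare
    return agree / total

example = consistency([1, 1, 1, -1])
print(example)
```

With three positive votes and one negative vote there are three agreeing pairs and three conflicting pairs, giving a consistency of 0.5.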
Vote classification
• Most peers discover their first peer correlation after casting fewer than 18 votes
• Coordinated attackers cast a lot of votes
[Figure: # of votes cast]
Files in Credence
• Data set
– 681 Credence clients, which advertise a total of 84,838 files, of which 67,794 are unique
File distribution (Decoys)
• By number of times shared
• By number of hosts
Two types of attacks
File Voting Popularity
• Voting data set comprises 39,761 votes cast on 35,690 unique files
• Positive votes are spread evenly
• Negative votes have a more skewed distribution
Voting and Sharing
• Sharing and voting behavior are largely independent
• Voting can contradict sharing
End-to-End Performance
• A load generator repeatedly queried the Gnutella network for typical keywords over a 24-hour period and logged the search results returned (sorting the files by # of peers sharing them).
Resistance to Collusion
• Pick peers from the main cluster
• Large-scale attacks are more likely to be detected
• Detects 75% of decoys
Ranking Performance
Credence Overhead
• Inbound traffic: A highly active client receives 100 bytes per second of additional background traffic in Credence.
• Outbound traffic: depends on popularity of client’s votes, client’s reputation and Gnutella connectivity
• Processing overhead: < 1% of a 1.7 GHz CPU
Conclusion
• The first distributed P2P object reputation system to identify pollution
• Provide incentives for users to participate honestly in voting
• Not specific to Gnutella network
My comment
• Pros
– Incentive seems robust
• Cons
– Performance verification is weak
– No comparison with other mechanisms
– Still needs a centralized certificate authority
– Storing votes wastes space (need to maintain vote database, trust graph, correlation table)
– People are lazy (eMule way, but cannot avoid large attacks)
• The design of Credence is guided by several goals that are necessary requirements for a successful peer-to-peer reputation mechanism
– Relevance
– Distribution and decentralization
– Robustness
– Isolation
– Motivation
  • To participate honestly in the reputation system