Declustering the iTrust Search and Retrieval Network to Increase Trustworthiness
Presentation by Christopher Badger
Research conducted in collaboration with Yung-Ting Chuang, Isai Michel Lombera, Louise E. Moser and P. M. Melliar-Smith
University of California, Santa Barbara
Supported in part by NSF Grant CNS 10-16193
Overview
Introduction
Design and Implementation of iTrust
Peer Neighborhoods
Declustering Algorithm
Clustering Coefficients
Results and Analysis
Expectation of Cooperation
Conclusions and Future Work
WEBIST 2012 iTrust Christopher Badger
Introduction
Search engines such as Google and Yahoo! have played an increasingly important role in today's world
  They offer fast and accurate search results … ideally
  They are centralized, and therefore vulnerable to:
    Attack
    Censorship
iTrust is our solution to this problem
  iTrust is a P2P network that functions by distributing metadata about documents and search requests to random nodes in the iTrust membership
  Designed to be resistant to censorship and attack
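A rough way to see why random distribution works is to compute the chance that a request lands on at least one node holding the metadata. The sketch below is an illustration under simplifying assumptions, not the analysis from the talk: metadata goes to m distinct nodes and the request to r distinct nodes, both chosen uniformly at random from n members, and every node cooperates. The names n, m, r and match_probability are hypothetical.

```python
from math import comb

def match_probability(n, m, r):
    """Chance that at least one of the r request recipients holds the metadata.

    Assumes (hypothetically) that metadata went to m of the n nodes and the
    request to r of the n nodes, each set chosen uniformly without replacement.
    """
    # comb(n - m, r) counts the ways to pick r recipients that all miss the metadata.
    return 1.0 - comb(n - m, r) / comb(n, r)

# Example: 10 members, metadata on 3 nodes, request sent to 3 nodes.
print(match_probability(10, 3, 3))
```

This ignores the neighborhood graph entirely; the match probability reported later in the talk is measured on specific graphs and need not follow this formula.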
Design of iTrust
[Figure, three panels. Panel 1 labels: Source of Information. Panel 2 labels: Source of Information, Requester of Information, Request Encounters Metadata. Panel 3 labels: Source of Information, Requester of Information, Request Matched.]
HTTP Implementation of iTrust
[Figure: components of the HTTP implementation, in panels (a), (b), (c). Labels: Apache; PHP; public interface; delete nodes; leave membership; query; search; inbox; statistics; user settings; tools; metadata inbox; tika / lucene / dictionary; metadata functions; metadata xml engine; register metadata list; apply xml; publish xml list; helper functions; nodes wrapper; keywords wrapper; resource wrapper; tag keyword resource; search functions; globals / navigation; cURL; SQLite; session; log; PECL http]
Neighborhoods
We define the neighborhood of a node as: all of the other nodes to which the node is directly connected
Importance of a node's neighborhood
  The flow of information
Neighborhoods cannot be unlimited in size
  Too expensive to track the entire network
Neighborhoods
[Figure: the green nodes comprise peer A's neighborhood]
Neighborhoods
Beneficial to have only trustworthy nodes in one's neighborhood
How to determine which nodes are trustworthy?
  How to define trustworthiness?
Randomness
  Why is a random neighborhood useful?
  How to achieve neighborhood randomness?
    Declustering Algorithm
Declustering Algorithm
The process
  1. Ask all current neighbors for a list of their neighbors
  2. Create a master list containing all of these gathered lists
  3. Ensure the list contains only unique peers
  4. Drop all existing connections, effectively clearing the neighborhood
  5. Select new neighbors randomly from the obtained list
     Can be done in a manner similar to the binomial distribution, where each node has an equal chance to become a neighbor
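The process above can be sketched in a few lines of Python. This is a minimal illustration, not the iTrust implementation (which is written in PHP): the in-memory dictionary of neighbor sets, the function name decluster, and the acceptance probability p are all assumptions made for the example. Each candidate is accepted independently with the same probability, so the new neighborhood size follows a binomial distribution.

```python
import random

def decluster(node, neighbors, p, rng=random):
    """Run one declustering pass for `node` and return its new neighborhood.

    neighbors: dict mapping each peer to the set of its neighbors (undirected).
    p:         hypothetical probability that a candidate becomes a neighbor.
    """
    # Gather the neighbor lists of all current neighbors into one set of unique peers.
    candidates = set()
    for peer in neighbors[node]:
        candidates |= neighbors[peer]
    candidates.discard(node)  # a node is not its own neighbor

    # Drop all existing connections, clearing the neighborhood.
    for peer in list(neighbors[node]):
        neighbors[peer].discard(node)
    neighbors[node] = set()

    # Give every candidate the same independent chance of being selected.
    for peer in sorted(candidates):
        if rng.random() < p:
            neighbors[node].add(peer)
            neighbors[peer].add(node)
    return neighbors[node]

# Example: a four-node path 0-1-2-3; node 1 declusters and accepts every candidate.
graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(decluster(1, graph, p=1.0))
```

In a real deployment the neighbor lists would arrive over the network, and a node would likely choose p so that its expected neighborhood size stays bounded.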
What Declustering Does
  Randomizes each node's neighborhood
  Reduces the clustering coefficient of the node performing declustering
    The clustering coefficient is a measure of how cliquish the network is
  Is performed locally by each node
    Does not require a global context
  Lowers the expectation of cooperation
Metrics
Local clustering coefficient is defined as:
  (the number of edges between a node's neighbors) / (the number of edges that could occur between them)
To calculate the local clustering coefficient of node X, put all of X's neighbors into a set S
Find E, the number of possible edges between all nodes in S
  • For an undirected graph, this number is: $E = \frac{|S| \times (|S| - 1)}{2}$
Find e, the number of edges that exist between nodes in S
The local clustering coefficient for X is given as: $\frac{e}{E}$
Global clustering coefficient is defined as:
  The average of all of the local clustering coefficients
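The clustering coefficient definitions can be checked on a small undirected graph. The sketch below is an illustration only; the dictionary-of-sets graph representation and the function names are assumptions, and returning 0 for a node with fewer than two neighbors is a convention chosen here, since the slides do not say how that case is handled.

```python
from itertools import combinations

def local_clustering(node, neighbors):
    """Local clustering coefficient e / E for `node` in an undirected graph."""
    s = neighbors[node]                      # the set S of the node's neighbors
    possible = len(s) * (len(s) - 1) // 2    # E: edges that could exist within S
    if possible == 0:
        return 0.0                           # assumed convention for |S| < 2
    # e: edges that actually exist between pairs of nodes in S
    existing = sum(1 for a, b in combinations(s, 2) if b in neighbors[a])
    return existing / possible

def global_clustering(neighbors):
    """Average of the local clustering coefficients over all nodes."""
    return sum(local_clustering(n, neighbors) for n in neighbors) / len(neighbors)

# Example: a triangle A-B-C with one extra node D attached to A.
graph = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
print(local_clustering("A", graph))   # one of three possible edges exists among B, C, D
print(global_clustering(graph))
```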
Metrics
Maximum degree of any node in the network
  The number of connections of the most connected node in the network
  Used as a reference for the prevalence of hubs in the network
Match probability
  The probability that an iTrust search in the particular graph results in a hit
Network view
  The cardinality of the set containing all of a node's neighbors and the neighbors of those neighbors
Average network view
  The average of all nodes' network views
Results
[Figure panels: Erdős–Rényi Graph, Watts-Strogatz Graph, Barabási–Albert Graph]
Results

                                  Initial   1st Pass   2nd Pass   3rd Pass
Erdős–Rényi Graph
  Maximum Hub Degree              192       187        187        190
  Average Network View            1000      1000       1000       1000
  Global Clustering Coefficient   0.1502    0.1501     0.1501     0.1499
  Match Probability               0.9282    0.9283     0.9282     0.9279
Watts-Strogatz Graph
  Maximum Hub Degree              150       187        185        180
  Average Network View            301       1000       1000       1000
  Global Clustering Coefficient   0.7450    0.1506     0.1503     0.1501
  Match Probability               0.2858    0.9286     0.9283     0.9290
Barabási–Albert Graph
  Maximum Hub Degree              492       246        187        186
  Average Network View            1000      1000       1000       1000
  Global Clustering Coefficient   0.2399    0.1533     0.1505     0.1508
  Match Probability               0.9652    0.9297     0.9281     0.9283
Results and Analysis
Declustering reduced the clustering coefficient in both the Watts-Strogatz and Barabási–Albert graphs
Declustering evened out the degree distribution in the network, acting to eliminate any hubs
For the Watts-Strogatz graph, the iTrust match probability greatly increased
Overall, declustering was able to effectively turn the Watts-Strogatz and Barabási–Albert graphs into random graphs similar to the Erdős–Rényi graph
By promoting network randomization, the minimum expectation of cooperation was decreased, thereby increasing robustness
Expectation of Cooperation
Definition
  Subjectively, the degree to which nodes act or rely on information provided by other nodes
Minimum Expectation of Cooperation
  The minimum degree of cooperation expected from all nodes in order for the network to function well
Importance
  A lower minimum expectation of cooperation allows nodes in the network to continue functioning well, despite increased resistance or attack by others
Conclusions
The declustering strategy increases iTrust's trustworthiness by randomizing peer neighborhoods
Declustering also decreases the global clustering coefficient of the network, which helps improve message forwarding performance
iTrust can be valuable for people who seek information on the Internet and are wary of potential censorship
Future Work
We are looking into combining declustering and different message relaying strategies to increase network robustness
In addition to the HTTP implementation, we are also developing an SMS implementation for iTrust
We intend to release all iTrust source code and related documentation to the general public
Questions?
Our iTrust Web Site
  http://itrust.ece.ucsb.edu
Contact Information
  Christopher Badger: [email protected]
  Yung-Ting Chuang: [email protected]
  Isai Michel Lombera: [email protected]
Our project is supported by NSF CNS 10-16193