Fast Searching in Peer-to-Peer Networks
Self-Organizing Parallel Search Clusters
Rocky Dunlap
Agenda
• Peer-to-peer Networks
• Search Links/Index Links Model
• Parallel Search Clusters
• Self-Organizing Parallel Search Clusters
• Further Research
Peer-to-Peer Networks
• Peer = Client + Server
• Anyone can send/process messages
• Highly Distributed
• Highly Parallel
• Data-centric routing
P2P Networks – Two Types
• Unstructured
  – “Loose” network structure
  – Requires less control of peers (casual searching)
  – Fault tolerance, churn
  – Keyword searching
• Structured
  – Specific network structure
  – Distributed Hash Tables – Smart routing
  – Guarantees:
    – Bounded hops
    – Bounded state
    – Ability to search entire network
Unstructured Searching
[Diagram not preserved in transcript]
The Problems
• Query saturation – every node processes every query
• Query processing redundancy
• Slow response time from distant nodes
• In practice, the time-to-live (TTL) limit on queries means the entire network cannot be searched
• Need a model for studying P2P networks
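The first four problems can be reproduced in a few lines. The sketch below is an illustration and is not taken from the talk: it assumes Gnutella-style flooding, where every node forwards a query to all of its neighbors until a hop limit (TTL) runs out. The `flood` and `ring` helpers and the 12-node ring topology are hypothetical.

```python
from collections import deque

def flood(adj, origin, ttl):
    """Flood a query from `origin`; return (nodes reached, messages sent).

    Assumption: a node processes a query only the first time it sees it,
    but every forwarded copy still costs one message.
    """
    seen = {origin}
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if hops_left == 0:
            continue                      # TTL exhausted: stop forwarding here
        for nbr in adj[node]:
            messages += 1                 # a copy is sent even if nbr saw it already
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, hops_left - 1))
    return seen, messages

def ring(n):
    """Hypothetical test topology: n nodes, each linked to its two neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

adj = ring(12)
reached, messages = flood(adj, origin=0, ttl=3)
redundant = messages - (len(reached) - 1)   # copies delivered to nodes that already had the query
print(len(reached), messages, redundant)
```

With a TTL of 3 the query reaches only 7 of the 12 nodes, and 4 of the 10 messages are redundant. Raising the TTL to 6 reaches every node, at the price of every node processing the query.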
SIL Model
• Search Links (forwarding)
• Index Links (non-forwarding)
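A minimal sketch of the two link types, under stated assumptions: a search link from A to B means A forwards queries to B, and an index link means the holder keeps a copy of another node's index and answers on its behalf without forwarding the query to it. The `Node` class, the `search` function and the direction chosen for index links are illustrative choices, not the definitions from the SIL paper.

```python
from collections import deque

class Node:
    def __init__(self, name, docs):
        self.name = name
        self.docs = set(docs)     # keywords this node can answer for itself
        self.search_links = []    # queries ARE forwarded along these
        self.indexed = []         # nodes whose index this node holds (no forwarding)

def search(origin, keyword):
    """Return (names of nodes with a match, names of nodes that processed the query)."""
    processed, hits = set(), set()
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node.name in processed:
            continue
        processed.add(node.name)
        # The node answers for itself and for every node it indexes.
        for covered in [node] + node.indexed:
            if keyword in covered.docs:
                hits.add(covered.name)
        queue.extend(node.search_links)   # only search links carry the query onward
    return hits, processed

a, b, c = Node("A", ["jazz"]), Node("B", ["rock"]), Node("C", ["folk"])
a.search_links.append(b)   # search link A -> B
b.indexed.append(c)        # index link: B holds a copy of C's index
hits, processed = search(a, "folk")
print(hits, processed)
```

In this example C's content is found although C never processes the query: B answers for it.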
[Diagram sequence illustrating the SIL model not preserved in transcript]
Index links provide full coverage
Searches remain inside cluster
Parallel Search Clusters
• Assumptions
  – Keep network essentially unstructured (keyword searching, fault tolerance)
  – Search rate is high
  – Update rate is low
• Limit the number of nodes that process each query
• Provide full (or high) coverage of network
• Index links allow some nodes to proxy searches for others
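The idea in the bullets above can be sketched as follows. This is one possible arrangement, assumed for illustration: nodes are split round-robin into `k` clusters, and inside each cluster the indexes of all outside nodes are spread evenly over the members. The names `build_clusters` and `search`, the round-robin assignment and the sample data are hypothetical.

```python
def build_clusters(nodes, k):
    """Partition `nodes` into k clusters and give each cluster full index coverage."""
    clusters = [nodes[i::k] for i in range(k)]
    indexes = []   # indexes[c][member] = set of nodes whose index `member` holds
    for members in clusters:
        held = {m: {m} for m in members}          # members answer for themselves
        outsiders = [n for n in nodes if n not in members]
        for i, n in enumerate(outsiders):
            held[members[i % len(members)]].add(n)  # index link from an outsider
        indexes.append(held)
    return clusters, indexes

def search(docs, clusters, indexes, origin, keyword):
    """Flood the query inside the origin's cluster only; members proxy for outsiders."""
    c = next(i for i, members in enumerate(clusters) if origin in members)
    processed = set(clusters[c])
    hits = {n for m in processed for n in indexes[c][m] if keyword in docs[n]}
    return hits, processed

nodes = [f"n{i}" for i in range(8)]
docs = {n: {"other"} for n in nodes}
docs["n2"] = {"folk"}
docs["n5"] = {"folk"}
clusters, indexes = build_clusters(nodes, k=2)
hits, processed = search(docs, clusters, indexes, origin="n0", keyword="folk")
print(sorted(hits), len(processed))
```

Only the 4 nodes of the origin's cluster process the query, yet both matches are found, including the one on `n5` in the other cluster. The cost is that every index update must reach one holder in each cluster, which is why the low update rate is assumed.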
The Challenge
• Self-Organizing Parallel Search Clusters
• Decentralized
• Nodes only know a few neighbors
• Dealing with “churn”
• Minimal interruption of normal operations
Proposed Solution
• Existing clusters split into two new clusters
• Advantages
  – Solves origin problem (start with one cluster)
  – Clusters split autonomously
  – Automatic load balancing
• Three-phase approach
  – Color
  – Replicate Links
  – Split
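The transcript does not preserve the diagrams that define the three phases, so the sketch below is only one plausible reading. It assumes that an initiator colors every node within a given hop radius red and the rest green, that index copies are then replicated until each color covers everything the cluster covered before, and that search links between the two colors are finally removed. All function names and the six-node example are hypothetical.

```python
from collections import deque

def color(adj, initiator, radius):
    """Phase 1: nodes within `radius` hops of the initiator turn red, the rest green."""
    dist = {initiator: 0}
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return {n: ("red" if n in dist else "green") for n in adj}

def replicate(index_held, colors):
    """Phase 2: copy to each color every index held only by the other color.

    Assumes both colors have at least one member.
    """
    everything = set().union(*index_held.values())
    for mine in ("red", "green"):
        members = sorted(n for n in colors if colors[n] == mine)
        have = set().union(*(index_held[m] for m in members))
        for i, entry in enumerate(sorted(everything - have)):
            index_held[members[i % len(members)]].add(entry)

def split(adj, colors):
    """Phase 3: drop every search link whose endpoints have different colors."""
    return {n: [m for m in nbrs if colors[m] == colors[n]] for n, nbrs in adj.items()}

# Hypothetical cluster: six nodes in a line, each holding only its own index.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
index_held = {n: {n} for n in adj}
colors = color(adj, initiator=0, radius=2)
replicate(index_held, colors)
new_adj = split(adj, colors)
print(colors, new_adj)
```

After the split, nodes 0 to 2 and nodes 3 to 5 form two separate clusters, and each still holds index entries for all six nodes.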
Splitting Cluster
Phase 1: Coloring
• Color (radius = 2)
[Diagram sequence not preserved in transcript]
Splitting Cluster
Phase 2: Replicate Links
[Diagram sequence not preserved in transcript; elements are labeled “red” and “green”]
Splitting Cluster
Phase 3: Split
[Diagram sequence not preserved in transcript]
Further Research
• Initiating the split
• Choosing the radius for coloring phase
  – Want two clusters of same size
• Overloading index links
• Dealing with “churn”
  – Nice nodes
  – Not-so-nice nodes
• Merge operation?
• Simulation
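For the radius question, a simulation could simply try every radius and keep the one that gives the most even split. The helper below is a hypothetical offline sketch: it uses a global view of the cluster, which a real node would not have, and it assumes the cluster is connected and that coloring works by hop distance from the initiator.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop distance from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def best_radius(adj, initiator):
    """Return (radius, imbalance) minimizing |red - green|; smallest radius wins ties."""
    dist = bfs_distances(adj, initiator)
    total = len(adj)
    best = None
    for r in range(max(dist.values()) + 1):
        red = sum(1 for d in dist.values() if d <= r)
        imbalance = abs(red - (total - red))
        if best is None or imbalance < best[1]:
            best = (r, imbalance)
    return best

# Hypothetical cluster: six nodes in a line.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(best_radius(adj, 0))
print(best_radius(adj, 2))
```

On this line of six nodes an initiator at the end needs radius 2 for an even split, while an initiator at node 2 needs only radius 1, which suggests the right radius depends on where the split starts.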
Bibliography
• B. F. Cooper and H. Garcia-Molina. SIL: Modeling and Measuring Scalable Peer-to-peer Search Networks. http://www-db.stanford.edu/~cooperb/pubs/searchnets.pdf, 2003.
• B. Yang and H. Garcia-Molina. Improving Search in Peer-to-Peer Networks. http://dbpubs.stanford.edu:8090/pub/2002-28, 2002.