Fast Searching in Peer-to-Peer Networks

Self-Organizing Parallel Search Clusters

Rocky Dunlap

Agenda

• Peer-to-peer Networks

• Search Links/Index Links Model

• Parallel Search Clusters

• Self-Organizing Parallel Search Clusters

• Further Research

Peer-to-Peer Networks

• Peer = Client + Server

• Anyone can send/process messages

• Highly Distributed

• Highly Parallel

• Data-centric routing

P2P Networks – Two Types

• Unstructured
  – “Loose” network structure
  – Requires less control of peers (casual searching)
  – Fault tolerance, churn
  – Keyword searching

• Structured
  – Specific network structure
  – Distributed Hash Tables
  – Smart routing
  – Guarantees: bounded hops, bounded state, ability to search entire network

Unstructured Searching

The Problems

• Query saturation – every node processes every query

• Query processing redundancy

• Slow response time from distant nodes

• In reality, cannot search entire network (TTL)

• Need a model for studying P2P networks
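The problems above can be seen in a small flooding sketch. This is an illustration only, not code from the talk: the function name `flood`, the ring topology, and the message-counting rule (every forward counts, including forwards to nodes that already saw the query) are assumptions made for the example.

```python
from collections import deque

def flood(graph, origin, ttl):
    """Flood a query from `origin` with a hop limit of `ttl`.

    Returns (nodes that processed the query, messages sent).
    Simplification: a node forwards to all neighbors, including the sender.
    """
    seen = {origin}
    messages = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: the query stops here
        for neighbor in graph[node]:
            messages += 1  # redundant deliveries still cost a message
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return seen, messages

# Hypothetical 8-node ring: node i is linked to i-1 and i+1.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
reached, messages = flood(ring, origin=0, ttl=2)
print(len(reached), messages)
```

With a TTL of 2 only part of the ring is searched, and raising the TTL until every node is reached also makes every node process the query, which is the saturation problem listed above.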

SIL Model

• Search Links (forwarding)

• Index Links (non-forwarding)

Index links provide full coverage

Searches remain inside cluster
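A minimal sketch of the two link types, under an assumed reading of the model: a search link from A to B means A forwards queries to B, and an index link from C to A means A holds a copy of C's index and can answer on C's behalf. The `Node` class, its field names, and the sample content are hypothetical and not taken from the SIL paper.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    content: set = field(default_factory=set)
    # Search links: nodes this node forwards queries to.
    search_links: list = field(default_factory=list)
    # Index links pointing at this node: nodes whose content it indexes.
    indexed: list = field(default_factory=list)

def search(origin, keyword):
    """Forward a query along search links only.

    Each node that processes the query answers for itself and for every
    node it indexes. Returns (names of processing nodes, names of hits).
    """
    processed, hits = set(), set()
    stack = [origin]
    while stack:
        node = stack.pop()
        if node.name in processed:
            continue
        processed.add(node.name)
        for owner in [node] + node.indexed:
            if keyword in owner.content:
                hits.add(owner.name)
        stack.extend(node.search_links)
    return processed, hits

# Hypothetical network: a and b form a cluster; c and d are reached
# only through index links (c -> a, d -> b).
a = Node("a", content={"jazz"})
b = Node("b", content={"rock"})
c = Node("c", content={"blues"})
d = Node("d", content={"jazz"})
a.search_links.append(b)
b.search_links.append(a)
a.indexed.append(c)
b.indexed.append(d)

processed, hits = search(a, "jazz")
print(sorted(processed), sorted(hits))
```

In this example only a and b process the query, yet content held by c and d is still found, which is the sense in which index links provide coverage while searches stay inside the cluster.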

Parallel Search Clusters

• Assumptions
  – Keep network essentially unstructured (keyword searching, fault tolerance)
  – Search rate is high
  – Update rate is low

• Limit the number of nodes that process each query

• Provide full (or high) coverage of network

• Index links allow some nodes to proxy searches for others
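The two rate assumptions matter because clustering trades query work for update work. The following is a rough back-of-the-envelope cost model, not one given in the talk: it assumes equal-sized clusters, that each query is processed by every node of one cluster, and that each index update is sent to one node in each of the other clusters. All function and parameter names are hypothetical.

```python
def flooding_cost(n_nodes, query_rate):
    """Work per unit time when every node processes every query."""
    return query_rate * n_nodes

def clustered_cost(n_nodes, n_clusters, query_rate, update_rate):
    """Work per unit time with parallel search clusters (assumed model)."""
    # Each query is processed only by the nodes of a single cluster.
    query_work = query_rate * n_nodes / n_clusters
    # Each update is replicated over index links to the other clusters.
    update_work = update_rate * (n_clusters - 1)
    return query_work + update_work

# Search-heavy workload: clustering cuts the work substantially.
print(flooding_cost(1000, 100), clustered_cost(1000, 10, 100, 1))
# Update-heavy workload: index replication dominates and clustering loses.
print(flooding_cost(1000, 1), clustered_cost(1000, 10, 1, 20000))
```

Under this model the approach pays off only when queries greatly outnumber updates, which matches the stated assumptions.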

The Challenge

• Self-Organizing Parallel Search Clusters

• Decentralized

• Nodes only know a few neighbors

• Dealing with “churn”

• Minimal interruption of normal operations

Proposed Solution

• Existing clusters split into two new clusters

• Advantages
  – Solves origin problem (start with one cluster)
  – Clusters split autonomously
  – Automatic load balancing

• Three-phase approach
  – Color
  – Replicate Links
  – Split

Splitting Cluster

Phase 1: Coloring (figure: Color, radius = 2)

Phase 2: Replicate Links (figure: links labeled red and green)

Phase 3: Split (figure: links marked X)
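The three phases can be sketched as follows. The slides do not spell out the protocol, so this is one plausible reading and every detail is an assumption: nodes within `radius` hops of the initiator turn red and the rest green; each indexed node then gets a holder of each color so both halves keep full coverage; finally search links between colors are cut. The names `color`, `replicate_links`, `split`, and `index_holders` are hypothetical, and the sketch is centralized for clarity, whereas the real scheme would have to be decentralized.

```python
from collections import deque

def color(search_links, initiator, radius):
    """Phase 1: breadth-first coloring out to `radius` hops."""
    dist = {initiator: 0}
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for neighbor in search_links[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return {n: "red" if n in dist else "green" for n in search_links}

def replicate_links(colors, index_holders):
    """Phase 2: make sure every node is indexed by both colors.

    `index_holders` maps a node to the cluster members indexing it.
    Assumption: one proxy per color (the smallest node id) takes the
    new index links. Raises ValueError if a color has no nodes.
    """
    proxy = {c: min(n for n in colors if colors[n] == c)
             for c in ("red", "green")}
    replicated = {}
    for node in set(index_holders) | set(colors):
        holders = set(index_holders.get(node, ()))
        if node in colors:
            holders.add(node)  # a member always answers for itself
        present = {colors[h] for h in holders}
        for c in ("red", "green"):
            if c not in present:
                holders.add(proxy[c])
        replicated[node] = holders
    return replicated

def split(search_links, colors):
    """Phase 3: drop search links whose endpoints differ in color."""
    return {n: [m for m in nbrs if colors[m] == colors[n]]
            for n, nbrs in search_links.items()}

# Hypothetical cluster: a path 0-1-2-3-4-5, plus an outside node "x"
# whose content is indexed by member 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
colors = color(path, initiator=0, radius=2)
holders = replicate_links(colors, {"x": {1}})
new_links = split(path, colors)
print(colors, new_links, holders["x"])
```

Funneling all new index links to a single proxy per color is the simplest choice here, and it concentrates load on that node, which relates to the "overloading index links" item under Further Research.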

Further Research

• Initiating the split

• Choosing the radius for coloring phase
  – Want two clusters of same size

• Overloading index links

• Dealing with “churn”
  – Nice nodes
  – Not-so-nice nodes

• Merge operation?

• Simulation
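For the radius question, a simulation could try each radius and keep the one that splits the cluster most evenly. The sketch below assumes global knowledge of a connected cluster, which real nodes would not have, so it is only a baseline for experiments; `balanced_radius` is a hypothetical name.

```python
from collections import deque

def balanced_radius(search_links, initiator):
    """Return the coloring radius whose red set is closest to half the cluster."""
    # Hop distance from the initiator to every node (cluster assumed connected).
    dist = {initiator: 0}
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        for neighbor in search_links[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    total = len(search_links)
    best_radius, best_gap = 0, None
    for radius in range(max(dist.values()) + 1):
        red = sum(1 for d in dist.values() if d <= radius)
        gap = abs(2 * red - total)  # 0 means a perfectly even split
        if best_gap is None or gap < best_gap:
            best_radius, best_gap = radius, gap
    return best_radius

# Hypothetical cluster: a path 0-1-2-3-4-5.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(balanced_radius(path, 0), balanced_radius(path, 2))
```

The best radius depends on where the initiator sits in the cluster, so choosing the radius and choosing the initiator are linked problems.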

Bibliography

• B. F. Cooper and H. Garcia-Molina. SIL: Modeling and Measuring Scalable Peer-to-peer Search Networks. http://www-db.stanford.edu/~cooperb/pubs/searchnets.pdf, 2003.

• B. Yang and H. Garcia-Molina. Improving Search in Peer-to-Peer Networks. http://dbpubs.stanford.edu:8090/pub/2002-28, 2002.
