Routing Indices For P-to-P Systems ICDCS 2002. Introduction Search in a P2P system –Mechanisms...
Routing Indices For P-to-P Systems
ICDCS 2002
Introduction
• Search in a P2P system
  – Mechanisms without an index
  – Mechanisms with specialized index nodes (centralized search)
  – Mechanisms with indices at each node
• Structured P2P network
• Unstructured P2P network
• Parallel vs. sequential search
  – Response time
  – Network traffic
Routing indices (RI)
• Query
  – Documents are on zero or more “topics”, and queries request documents on particular topics.
  – Document topics are independent
• Local index
• RI
  – Each node has a local routing index which contains the following information
    • The number of documents along each path
    • The number of documents on each topic of interest
  – Allows a node to select the “best” neighbors to send a query to
• The RI may be “coarser” than the local indices
  – Overcounts
  – Undercounts
• Goodness measure
  – Number of results in a path
• Using routing indices
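A minimal sketch of how a node might use a routing index to rank its neighbors. It assumes the topic-independence property stated earlier, so the estimated number of results is the path total multiplied by the per-topic fractions. The class name `RoutingIndex`, its methods, and the sample counts are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingIndex:
    """Per-neighbor document counts: a path total plus one counter per topic."""
    rows: dict = field(default_factory=dict)  # neighbor -> (total, {topic: count})

    def estimated_results(self, neighbor, topics):
        """Estimate documents reachable via `neighbor` that match ALL topics.

        Assumes topics are independent, so the matching fraction is the
        product of the per-topic fractions.
        """
        total, per_topic = self.rows[neighbor]
        if total == 0:
            return 0.0
        estimate = float(total)
        for topic in topics:
            estimate *= per_topic.get(topic, 0) / total
        return estimate

    def rank_neighbors(self, topics):
        """Return neighbors ordered from most to least promising."""
        return sorted(
            self.rows,
            key=lambda n: self.estimated_results(n, topics),
            reverse=True,
        )


# Hypothetical counts for three neighbors B, C, and D.
ri = RoutingIndex({
    "B": (100, {"db": 20, "lang": 30}),
    "C": (1000, {"db": 0, "lang": 50}),
    "D": (200, {"db": 100, "lang": 150}),
})
best_first = ri.rank_neighbors(["db", "lang"])
```

The query would be forwarded to the first neighbor in `best_first`, falling back to the next one if not enough results come back.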
  – Storage space
    • N: number of nodes in the P2P network
    • b: branching factor
    • c: number of categories
    • s: counter size in bytes
Centralized index: s × (c + 1) × N
Distributed system: s × (c + 1) × b (each node)
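The two storage formulas can be compared with a short calculation. The function names and the sample values (4-byte counters, 10 categories, 10,000 nodes, branching factor 4) are illustrative assumptions, not figures from the slides.

```python
def centralized_index_bytes(s, c, n):
    """Storage at a central index: s * (c + 1) * N bytes."""
    return s * (c + 1) * n


def routing_index_bytes_per_node(s, c, b):
    """Storage for one node's routing index: s * (c + 1) * b bytes."""
    return s * (c + 1) * b


# Hypothetical parameters chosen only to show the scale difference.
central = centralized_index_bytes(s=4, c=10, n=10_000)    # 440000 bytes in one place
per_node = routing_index_bytes_per_node(s=4, c=10, b=4)   # 176 bytes at each node
```

The per-node cost depends on the branching factor b rather than on the network size N.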
• Creating routing indices
• Maintaining routing indices
  – Trade-off between RI freshness and update cost
  – Does not require the participation of a disconnecting node
• Discussion
  – What if the search topics are dependent?
  – Can the number of “hops” necessary to reach a document be estimated?
Alternative Routing Indices
• Hop-count RI
  – Aggregated RIs for each “hop” up to a maximum number of hops are stored
  – Search cost
    • Number of messages
  – The goodness of a neighbor
    • The ratio between the number of documents available through that neighbor and the number of messages required to get those documents
  – Regular tree with fanout F
  – It takes F^h messages to find all documents at hop h
  – Storage cost?
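A rough sketch of the documents-per-message ratio under the regular-tree model. It assumes hops are indexed from 0 for the neighbor itself, so reaching the documents at index j costs F^j messages; this indexing, the function name `hop_count_goodness`, and the sample counts are assumptions made for illustration.

```python
def hop_count_goodness(docs_per_hop, fanout):
    """Sum the documents at each hop, discounted by that hop's message cost.

    docs_per_hop[j] is the estimated number of matching documents at hop
    index j through one neighbor; reaching them is assumed to cost
    fanout**j messages (regular tree with the given fanout).
    """
    return sum(docs / fanout ** j for j, docs in enumerate(docs_per_hop))


# Hypothetical neighbors: X has documents close by, Y has more but farther away.
goodness_x = hop_count_goodness([13, 10], fanout=3)
goodness_y = hop_count_goodness([0, 31], fanout=3)
preferred = "X" if goodness_x >= goodness_y else "Y"
```

With these numbers X is preferred even though Y leads to more documents in total, because Y's documents cost more messages to reach.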
• Exponentially aggregated RI
  – Store the result of applying the regular-tree cost formula to a hop-count RI
  – How to compute the goodness of a path for a query containing several topics?
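One way to picture the exponentially aggregated RI is as a hop-count RI row collapsed into a single discounted value per topic. The sketch below assumes the same regular-tree discount (hop index j costs F^j messages, with the neighbor at index 0); the function name and data layout are hypothetical.

```python
def exponential_ri_row(hop_count_row, fanout):
    """Collapse a hop-count RI row into one discounted value per topic.

    hop_count_row is a list of {topic: count} dicts, one entry per hop.
    """
    aggregated = {}
    for j, counts in enumerate(hop_count_row):
        for topic, count in counts.items():
            aggregated[topic] = aggregated.get(topic, 0.0) + count / fanout ** j
    return aggregated


# Hypothetical row for one neighbor: counts at hop index 0 and hop index 1.
row = exponential_ri_row([{"db": 13, "net": 3}, {"db": 10}], fanout=2)
```

Only the aggregated values need to be stored, at the price of losing the per-hop detail.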
Cycles in the P2P network (HW)
Improving Search in Peer-to-Peer Networks
ICDCS 2002
Beverly Yang
Hector Garcia-Molina
Outline
• Introduction
• Techniques
• Experiment
Introduction
• We present three techniques for efficient search in P2P systems.
  – The basic idea is to reduce the number of nodes that process a query
Current Techniques
• Gnutella
  – BFS with depth limit D.
  – Wastes bandwidth and processing resources.
• Freenet
  – DFS with depth limit D.
  – Poor response time.
Iterative Deepening
• Under policy P = {a, b, c}; waiting time W
• See example.
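A simplified simulation of iterative deepening over an in-memory graph. It runs a breadth-first search at each depth listed in the policy and stops as soon as enough results are found. The waiting time W and any reuse of the previous frontier are not modeled, and the function names and the sample graph are hypothetical.

```python
from collections import deque


def nodes_within(graph, source, depth):
    """Breadth-first search: every node at most `depth` hops from `source`."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # depth limit reached, do not forward further
        for neighbor in graph[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return set(hops)


def iterative_deepening(graph, source, has_result, policy, wanted):
    """Run BFS at each depth in `policy` until `wanted` results are found."""
    hits = []
    used_depth = None
    for depth in policy:
        used_depth = depth
        reached = nodes_within(graph, source, depth) - {source}
        hits = sorted(n for n in reached if has_result(n))
        if len(hits) >= wanted:
            break  # query satisfied, no deeper iteration needed
    return used_depth, hits


# Hypothetical chain A - B - C - D where only D holds a matching document.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
depth_used, found = iterative_deepening(
    graph, "A", lambda n: n == "D", policy=[1, 2, 3], wanted=1
)
```

Queries that can be satisfied nearby stop at a shallow depth, so the deeper and more expensive searches are only paid for when needed.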
Directed BFS
• A source sends query messages to just a subset of its neighbors
• A node maintains simple statistics on its neighbors
  – Number of results received from each neighbor
  – Latency of connection
Candidate nodes
• Returned the highest number of results
• Low hop-count
• High message count
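A small sketch of how a source could pick its candidate neighbors from per-neighbor statistics. The `NeighborStats` fields, the heuristic names, and the sample values are hypothetical stand-ins for the criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class NeighborStats:
    results_returned: int = 0    # results received through this neighbor so far
    avg_hops: float = 0.0        # average hop count of its responses
    messages_forwarded: int = 0  # messages this neighbor has forwarded


# Each heuristic maps stats to a sort key; smaller keys are better.
HEURISTICS = {
    "most_results": lambda s: -s.results_returned,
    "low_hop_count": lambda s: s.avg_hops,
    "most_messages": lambda s: -s.messages_forwarded,
}


def select_neighbors(stats, heuristic, k=1):
    """Return the k neighbors that score best under the chosen heuristic."""
    score = HEURISTICS[heuristic]
    return sorted(stats, key=lambda name: (score(stats[name]), name))[:k]


# Hypothetical statistics for three neighbors.
stats = {
    "n1": NeighborStats(results_returned=5, avg_hops=3.0, messages_forwarded=40),
    "n2": NeighborStats(results_returned=9, avg_hops=4.5, messages_forwarded=10),
    "n3": NeighborStats(results_returned=2, avg_hops=1.5, messages_forwarded=70),
}
targets = select_neighbors(stats, "most_results", k=1)
```

Only the source restricts its forwarding to `targets`; the sketch does not model what the selected neighbors do next.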
Local Indices
• Each node n maintains an index over the data of all nodes within a radius of r hops.
• All nodes at depths not listed in the policy simply forward the query.
• Example: policy P= { 1, 5}
Experimental Setup
• For each response, we log:
  – Number of hops taken
  – IP from which the Response message came
  – Response time
  – Individual results
Experimental result
Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems
Kunwadee Sripanidkulchai
Bruce Maggs
Hui Zhang
IEEE INFOCOM 2003
Motivation
• Although flooding is simple and robust, it is not scalable.
• A content location solution in which peers are organized into an interest-based structure on top of Gnutella.
• The algorithm is called interest-based shortcuts.
Interest-based locality
Shortcuts Architecture and Design Goals
• To create additional links on top of a peer-to-peer system’s overlay
• As a separate performance enhancement layer on top of existing content location mechanisms
Content location paths
Shortcut Discovery
• The first lookup returns a set of peers that store the content
• These are potential candidates.
• One peer is selected at random from the set and added to the shortcut list
• For scalability, each peer allocates a fixed-size amount of storage to implement shortcuts.
Shortcut selection
• We rank shortcuts based on their perceived utility
• A peer sequentially asks all of the shortcuts on its list.
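Shortcut discovery and sequential lookup can be sketched together as follows. The class `ShortcutList`, its methods, the rule that a newly learned shortcut goes to the front of the list, and the eviction of the last entry when storage is full are all assumptions made for illustration. The `fallback` callable stands in for the underlying content location mechanism (for example, flooding).

```python
import random


class ShortcutList:
    """Fixed-size list of shortcut peers, ordered from best to worst rank."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.rng = rng if rng is not None else random.Random()
        self.peers = []

    def learn(self, providers):
        """Add one randomly chosen provider from a successful lookup."""
        candidates = [p for p in providers if p not in self.peers]
        if not candidates:
            return None
        chosen = self.rng.choice(candidates)
        self.peers.insert(0, chosen)     # assumption: newest shortcut ranks first
        del self.peers[self.capacity:]   # fixed-size storage: drop the overflow
        return chosen

    def locate(self, content, has_content, fallback):
        """Ask shortcuts in rank order; use the fallback lookup if all miss."""
        for peer in self.peers:
            if has_content(peer, content):
                return peer, "shortcut"
        providers = fallback(content)
        self.learn(providers)
        return (providers[0] if providers else None), "fallback"


# Hypothetical setup: peer "p1" stores everything, the fallback finds it.
shortcuts = ShortcutList(capacity=2, rng=random.Random(0))
holds = lambda peer, content: peer == "p1"
first = shortcuts.locate("song.mp3", holds, lambda content: ["p1"])
second = shortcuts.locate("video.avi", holds, lambda content: ["p1"])
```

The first lookup goes through the fallback and creates the shortcut; the second is answered by the shortcut directly.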
Ranking metrics
• Probability of providing content
• Latency of the path to the shortcut
• Load at the shortcut
• A combination of metrics can be used based on each peer’s preference
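A combination of the metrics above could be expressed as a weighted score per shortcut. The linear scoring rule, the metric keys (`success`, `latency`, `load`), and the sample values are assumptions for illustration; the slides do not specify how the metrics are combined.

```python
def rank_shortcuts(metrics, weights):
    """Order shortcut peers by a weighted score; higher scores rank first.

    metrics maps peer -> {"success": probability, "latency": ms, "load": value}.
    A high success probability raises the score; latency and load lower it.
    """
    def score(peer):
        m = metrics[peer]
        return (
            weights.get("success", 0.0) * m["success"]
            - weights.get("latency", 0.0) * m["latency"]
            - weights.get("load", 0.0) * m["load"]
        )

    return sorted(metrics, key=lambda peer: (-score(peer), peer))


# Hypothetical measurements for two shortcuts.
metrics = {
    "p1": {"success": 0.9, "latency": 120.0, "load": 0.5},
    "p2": {"success": 0.6, "latency": 20.0, "load": 0.1},
}
by_success = rank_shortcuts(metrics, {"success": 1.0})
by_latency = rank_shortcuts(metrics, {"latency": 1.0})
```

Each peer can pass its own `weights`, which is one way to read "based on each peer's preference".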
Performance indices
• Success rate
• Load characteristics
• Query scope
• Minimum reply path lengths
• Additional state
Potential and Limitations
• Adding 5 shortcuts at a time produces success rates that are close to the best possible.
• Slightly increasing the shortest path length from 1 to 2 hops yields a better success rate.
Conclusion
• A simple and practical mechanism was proposed.
Similarity Discovery in structured P2P Overlays
ICPP
Introduction
• Structured P2P network
  – Only supports search with a single keyword
• Similarity between two documents
  – Keyword sets
  – Vector space
  – Measure: $\theta_{ab} = \cos^{-1}\left(\frac{a \cdot b}{|a|\,|b|}\right)$
• Problems
  – Search problem
  – New keyword?
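The similarity measure between two keyword vectors a and b, the angle arccos(a · b / (|a| |b|)), can be computed as in the sketch below. The function name `angle_between` and the sample vectors are illustrative; the clamp guards against floating-point values slightly outside [-1, 1].

```python
import math


def angle_between(a, b):
    """Angle in radians between two keyword vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("angle is undefined for a zero vector")
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


# Hypothetical documents over a three-keyword vocabulary.
disjoint = angle_between([1, 0, 0], [0, 1, 0])   # no shared keywords
identical = angle_between([1, 1, 0], [1, 1, 0])  # same keyword set
```

A smaller angle means more similar documents: disjoint keyword sets give a right angle, identical ones an angle of (numerically almost) zero.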
Meteorograph
• Absolute angle
Publishing and Searching
• Publish
  – Hash
  – Publish the item to a node np with the hash key closest to the hash value
• Search problem
  – Nearest answers
  – k-nearest answers
• Partial
• Comprehensive
• Search strategy
• Discussions
• What happens when the keyword vector is represented by ?
Other issues
• Load balance
• Changes of vector space
  – Republished?
  – Comprehensive set of keywords
  – Other methods?
SWAM: A Family of Access Methods for Similarity-Search in Peer-to-Peer Data Networks
Farnoush Banaei-Kashani
Cyrus Shahabi
(CIKM04)
PDN access method
• Defines
• How to organize the PDN topology to an index-like structure
• How to use the index structure
Hilbert space
• Hilbert space (V, Lp)
• Key k = (a1, a2, … , ad)
  – d: the dimension of the vector space
  – The domain is a contiguous and finite interval of R
• The Lp norm, with p in Z+
  – The distance function to measure the dissimilarity
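The Lp distance between two keys can be written in a few lines. The function name `lp_distance` and the sample keys are illustrative; p is restricted to positive integers as on the slide.

```python
def lp_distance(a, b, p):
    """Lp distance between two d-dimensional keys, for a positive integer p."""
    if len(a) != len(b):
        raise ValueError("keys must have the same dimension")
    if p < 1:
        raise ValueError("p must be a positive integer")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical two-dimensional keys.
manhattan = lp_distance((0, 0), (3, 4), p=1)  # 7.0
euclidean = lp_distance((0, 0), (3, 4), p=2)  # 5.0
```

A smaller distance means the two keys are less dissimilar under the chosen norm.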
Topology
• Topology of a PDN can be modelled as a directed graph G(N, E)
• A(n) is the set of neighbors for node n
• A node maintains
  – A limited amount of information about its neighbors, which includes
    • The keys of the tuples maintained at neighbors
    • The physical addresses of neighbors
• The processing of the query is completed when all expected tuples in the relevant result set are visited
• Access methods
  – Join, leave for virtual nodes
  – Forward for using local information to process queries and make forwarding decisions
The small world example
• Grid component
• Random graph component
• Processing of queries (exact, range, kNN) in the high-locality topology
Flat partitioning
• SWAM also employs the space partitioning idea: flat partitioning
Query Processing
• Exact-Match query processing
• Range query processing
• kNN Query processing
Data Indexing in Peer-to-Peer DHT Networks
ICDCS 2004
• Locating data using incomplete information.
  – How to search data in a DHT
• Data descriptors and queries
  – Semi-structured XML data
  – Query
    • Most specific query for d
    • Relationship between queries
• Given the most specific query, finding the location of the file is simple
• How about less specific queries?
• Solution
  – Provide a query-to-query service
    • For a given query q, the index service returns a list of more specific queries, covered by q
  – The DHT storage system must be extended
    • Insert(q, qi), q → qi: adds a mapping (q; qi) to the index of the node responsible for key q.
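The query-to-query service can be mimicked with an in-memory stand-in for a DHT. In this sketch the node responsible for a key is chosen by hashing the query string; the class `QueryIndexDHT`, the helper `most_specific`, the SHA-1 placement rule, and the sample query strings are all hypothetical.

```python
import hashlib


class QueryIndexDHT:
    """Toy DHT in which each node stores query -> more-specific-query mappings."""

    def __init__(self, num_nodes):
        self.nodes = [dict() for _ in range(num_nodes)]

    def _responsible(self, key):
        """Pick the node responsible for `key` (assumed placement rule)."""
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def insert(self, q, qi):
        """Add the mapping (q; qi) at the node responsible for key q."""
        entries = self._responsible(q).setdefault(q, [])
        if qi not in entries:
            entries.append(qi)

    def lookup(self, q):
        """Return the more specific queries registered under q."""
        return list(self._responsible(q).get(q, []))


def most_specific(dht, q):
    """Follow mappings from q until queries with no refinements are reached."""
    frontier, leaves, seen = [q], [], {q}
    while frontier:
        current = frontier.pop()
        refinements = dht.lookup(current)
        if not refinements:
            leaves.append(current)
        for refined in refinements:
            if refined not in seen:
                seen.add(refined)
                frontier.append(refined)
    return sorted(leaves)


# Hypothetical queries, written as path-like strings for brevity.
dht = QueryIndexDHT(num_nodes=8)
dht.insert("author=Smith", "author=Smith/title=TCP")
dht.insert("author=Smith", "author=Smith/title=IPv6")
dht.insert("author=Smith/title=TCP", "author=Smith/title=TCP/year=1989")
resolved = most_specific(dht, "author=Smith")
```

A user starting from a broad query can iterate through the returned lists until reaching a most specific query, which then locates the file directly.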