1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings...
-
Upload
denis-patterson -
Category
Documents
-
view
215 -
download
0
Transcript of 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings...
![Page 1: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/1.jpg)
130 November 2006
An Efficient Nearest Neighbor (NN) Algorithmfor Peer-to-Peer (P2P) Settings
Ahmed Sabbir ArifGraduate Student, York University
![Page 2: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/2.jpg)
230 November 2006
Introduction
Though Peer-to-peer (P2P) networks are becoming increasingly popular as a powerful means for data exchange, methods for accessing complex data such as ‘spatial’ data on P2P networks are still at their infancy. In my presentation I will demonstrate a new and efficient algorithm for acquiring complex data off P2P Network.
Organization of this presentation: Nearest Neighbor (NN) Algorithm P2P Network How NN has been used in P2P data acquisition Problem with traditional strategy New algorithm Comparison
![Page 3: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/3.jpg)
330 November 2006
An Heuristic Approach to Travelling Salesman Problem (TSP)
Nearest Neighbor (NN) was one of the first algorithms used to determine a solution to the Travelling Salesman Problem (TSP).
It is NP-hard
These are the steps of the algorithm:1. Pick an arbitrary starting vertex.2. Terminate if the current vertex has no unmarked edges.3. Find the unmarked edge from the current vertex with the
least weight and mark it.4. Select the vertex at the other end of the edge.5. Repeat from step 2.
5
4 2
163
![Page 4: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/4.jpg)
430 November 2006
An Heuristic Approach to Travelling Salesman Problem (TSP) cont.
It doesn’t in general compute the optimal result but takes little time to execute.
Reliable only for special cases of the TSP where the triangle inequality is satisfied.
In the worst case, the algorithm can compute tours that are by an arbitrary factor larger than the optimal tour: For every constant r there is an instance of the TSP such that
the length of the tour computed by the NN algorithm >= (r * length of the optimal tour)
![Page 5: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/5.jpg)
530 November 2006
Peer-2-Peer Network
ClientServer
ClientServer
ClientServer
![Page 6: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/6.jpg)
630 November 2006
Quadtree Data Structure
A quadtree is a tree data structure in which each internal node has up to 4 children.
Used to partition a 2D space by recursively subdividing it into 4 quadrants/regions.
The regions may be square, rectangular or of arbitrary shapes.
![Page 7: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/7.jpg)
730 November 2006
Distributed Hash Table (DHT) Chord
A class of decentralized distributed systems that partition ownership of a set of keys among participating nodes, and can efficiently route messages to the unique owner of any given key.
Can map peer address too.
Chord adapts efficiently as nodes join, leave.
There is a high probability that it will find a file in O(log n) time for n peers.
![Page 8: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/8.jpg)
830 November 2006
Usual Query Strategy
Chord
IPKeys
Quad Tree
Data storageBucket 2
Data storageBucket 1
Data storageBucket 3
Data storageBucket 4
Control Point
Chord routing
algorithm
![Page 9: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/9.jpg)
930 November 2006
Problems with Usual Strategies
Retrieving data is not obvious because in P2P there is no central server/administration.
Bad with special data. Can’t work in parallel (works sequentially). Approaches with a single root, that increases the possibility
of getting an non-optimal result. Single point of failure makes all tree operation start from
the peer who stores the root control point.
![Page 10: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/10.jpg)
1030 November 2006
The Minimum Level ƒmin
Single point of failure makes all tree operation start from the peer who stores the root control point.
To avoid this ƒmin is introduced Forces objects to be stored (and query process to start) at a
levell ƒmin
ƒmax is also introduced to avoid objects being stored at l ƒmax
![Page 11: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/11.jpg)
1130 November 2006
A Practical Approach to Parallelism
Have to avoid all-to-all communication. There remains a possibility where the spatial data structure
will return a closer object rather than the object that is more relevant, this has to be fixed.
1st iteration: Can be done:
By maintaining multiple entry point. Has a single peer’s point of view.
NN query a b .. .. ..
priority queue
Q-tree
initiates
![Page 12: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/12.jpg)
1230 November 2006
Worst Case (WC) Criterion
After the 1st iteration we have: Many blocks (and the peers who maintain them) instead of single peer. Have chance to have more NN to be contacted.
2nd iteration decides which ones to be contacted using WC criterion Ensures that the relevant peers that can still help finding a closer
neighbor for the next nearest neighbor are contacted Works even for ƒmin >0
![Page 13: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/13.jpg)
1330 November 2006
Worst Case (WC) Criterion cont.
Let's assume: Quire point: q The current nearest neighbor is at distance: m Position is the priority queue: t Element: e
The algorithm computes the maximum distance MaxDist(q, t) Processes all elements e in the priority queue whose distance
Dist(q,e) < Min(MaxDist(q, t), m)
Goal is to look at two pieces of information:1. The first spatial object in the priority queue that, in the worst case,
can be the next NN.2. The maximum possible distance from the query point to an object in
the top element of the priority queue.
![Page 14: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/14.jpg)
1430 November 2006
NNQuery
AA AB BA BB
AC AD BC BD
CA CB DA DB
CC CD DC DD
XY
Z
Control point
Query point
WC-1
WC-2
![Page 15: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/15.jpg)
1530 November 2006
Analysis of this New Algorithm
This new NN algorithm assumes: Every peer has perfect quadtree with height h. There is no way to determine if the peer added before still holds the
data. P2P network matches their standard.
Traditional P2P NN Algorithm Sequential, that’s why Complexity is: O(4h)
This new algorithm is parallel NN will be found in: O(h) time Increment rate would be: O(4h/2) Overall complexity: O(2h) It’s a big improvement considering P2P network
![Page 16: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/16.jpg)
1630 November 2006
References
[1] Tanin, E., Nayar, D., and Samet, H. 2005. An efficient nearest neighbor algorithm for P2P settings, Proceedings of the 2005 National Conference on Digital Government Research, Atlanta, Georgia, May 15-18, 2005, pages 21-28.
[2] G. Kedem. The Quad-CIF Tree: A data structure for hierarchical online algorithms, Proceedings of the 19th Design Automation Conference, Las Vegas, NV, June 1982, pages 352-357.
[3] Wikipedia, Available: http://en.wikipedia.org/
![Page 17: 1 30 November 2006 An Efficient Nearest Neighbor (NN) Algorithm for Peer-to-Peer (P2P) Settings Ahmed Sabbir Arif Graduate Student, York University.](https://reader036.fdocuments.in/reader036/viewer/2022062519/5697c00d1a28abf838cc930d/html5/thumbnails/17.jpg)
1730 November 2006
Thank you!Questions
?