Reverse Furthest Neighbors in Spatial Databases
Bin Yao, Feifei Li, Piyush Kumar
Florida State University, USA
A Novel Query Type: Reverse Furthest Neighbors (RFN)
Given a point q and a data set P, find the set of points in P that take q as their furthest neighbor.
Two versions: Monochromatic Reverse Furthest Neighbors (MRFN) and Bichromatic Reverse Furthest Neighbors (BRFN).
Motivation and Related Work
Motivation: inspired by the Reverse Nearest Neighbor (RNN) query.
RNN: the set of points that take the query point as their nearest neighbor, in monochromatic and bichromatic versions.
Many applications behind the studies of RNN have corresponding "furthest" versions.
MRFN Application
P: a set of sites of interest in a region.
For any site, the query finds the sites that take it as their furthest neighbor.
This implies that visitors to the RFN of a site are unlikely to visit this site because of the long distance.
Ideally, the site should put more effort into advertising itself at those sites.
BRFN Application
P: a set of customers.
Q: a set of business competitors offering similar products.
A distance measure reflects the rating of customer p for competitor q's product; a larger distance indicates a lower preference.
For any competitor in Q, an interesting query is to discover the customers that dislike its product the most among all competing products in the market.
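BRFN Example
[Figure: customers $p_1$ to $p_8$ and products $q_1$ to $q_3$ plotted in the plane; the legend distinguishes customer points from product points.]
RFN of $q_1$: $p_3, p_5, p_6, p_7, p_8$
RFN of $q_2$: $\emptyset$
RFN of $q_3$: $p_1, p_2, p_4$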
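MRFN and BRFN
MRFN for q and P:
$MRFN(q, P) = \{p \mid p \in P,\ fn(p, P \cup \{q\}) = q\}$
BRFN for a point q in Q and P:
$BRFN(q, Q, P) = \{p \mid p \in P,\ fn(p, Q) = q\}$
Here fn(p, X) denotes the furthest neighbor of p in the set X.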
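The two definitions can be checked directly with a brute-force scan. This is only a reference sketch for small inputs, not one of the R-tree algorithms from the talk; the function names are illustrative, and ties are broken arbitrarily by `max`.

```python
from math import dist


def furthest_neighbor(p, candidates):
    """Return the candidate furthest from p (first one wins on ties)."""
    return max(candidates, key=lambda c: dist(p, c))


def mrfn_brute_force(q, P):
    """MRFN(q, P): points p in P whose furthest neighbor in P plus {q} is q."""
    return [p for p in P if furthest_neighbor(p, P + [q]) == q]


def brfn_brute_force(q, Q, P):
    """BRFN(q, Q, P): points p in P whose furthest neighbor in Q is q."""
    return [p for p in P if furthest_neighbor(p, Q) == q]
```

Both functions cost O(|P|^2) or O(|P| |Q|) distance computations, which is the baseline the pruning-based algorithms aim to beat.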
Outline
MRFN
  Progressive Furthest Cell Algorithm
  Convex Hull Furthest Cell Algorithm
  Dynamically updating the dataset
BRFN
MRFN: Progressive Furthest Cell Algorithm (first algorithm)
Lemma: Any point in the furthest Voronoi cell (fvc) of p takes p as its furthest neighbor among all points in P.
[Figure: points $p_1$ to $p_5$ with the furthest Voronoi cell $fvc(p_1)$.]
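The lemma amounts to a simple membership test: a location x belongs to fvc(p) exactly when no other point of P is further from x than p is. A minimal sketch of that test, with an illustrative function name:

```python
from math import dist


def in_fvc(x, p, P):
    """True if location x lies in the furthest Voronoi cell of p w.r.t. P."""
    return all(dist(x, p) >= dist(x, other) for other in P)
```

For example, with P = [(0, 0), (10, 0), (5, 8)], the location (9, 1) lies in the cell of (0, 0) because (0, 0) is the point of P furthest from it.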
Progressive Furthest Cell Algorithm (PFC)
PFC(Query q; R-tree T)
  Initialize two empty vectors $V_c$ and $V_p$; priority queue L with T's root node; fvc(q) = S;
  While L is not empty do
    Pop the head entry e of L;
    If e is a point then
      Update fvc(q);
      If fvc(q) is empty, return;
      If e is in fvc(q), then push e into $V_c$;
    Else
      If e ∩ fvc(q) is empty, then push e into $V_p$;
      Else for every child u of node e:
        If u ∩ fvc(q) is empty, insert u into $V_p$;
        Else insert u into L;
  Update fvc(q) using the points contained by entries in $V_p$;
  Filter the points in $V_c$ using fvc(q);
[Figure: $fvc(p_1)$ shown for points $p_1$ to $p_4$.]
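The "update fvc(q)" step can be illustrated by clipping a convex polygon with one bisector half-plane per point. The sketch below covers only that geometric core under simplifying assumptions: it scans P linearly instead of traversing an R-tree with a priority queue, it takes the initial space S to be a bounding box that encloses all points, and all function names are illustrative.

```python
def clip_by_bisector(cell, q, p):
    """Keep the part of convex polygon `cell` where q is at least as far as p."""
    # |x - q|^2 >= |x - p|^2  <=>  2 x.(p - q) - (|p|^2 - |q|^2) >= 0
    ax, ay = 2 * (p[0] - q[0]), 2 * (p[1] - q[1])
    b = p[0] ** 2 + p[1] ** 2 - q[0] ** 2 - q[1] ** 2

    def side(v):
        return ax * v[0] + ay * v[1] - b

    out = []
    for i, cur in enumerate(cell):
        nxt = cell[(i + 1) % len(cell)]
        s_cur, s_nxt = side(cur), side(nxt)
        if s_cur >= 0:
            out.append(cur)
        if s_cur * s_nxt < 0:  # this edge crosses the bisector
            t = s_cur / (s_cur - s_nxt)
            out.append((cur[0] + t * (nxt[0] - cur[0]),
                        cur[1] + t * (nxt[1] - cur[1])))
    return out


def furthest_cell(q, points, box):
    """fvc(q) inside box = (xmin, ymin, xmax, ymax), as a CCW vertex list."""
    xmin, ymin, xmax, ymax = box
    cell = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    for p in points:
        cell = clip_by_bisector(cell, q, p)
        if not cell:  # empty cell: q has no reverse furthest neighbor
            break
    return cell


def contains(cell, x, eps=1e-9):
    """True if x lies inside (or on the boundary of) the CCW convex polygon."""
    if len(cell) < 3:
        return False
    for i, a in enumerate(cell):
        b = cell[(i + 1) % len(cell)]
        if (b[0] - a[0]) * (x[1] - a[1]) - (b[1] - a[1]) * (x[0] - a[0]) < -eps:
            return False
    return True


def mrfn_by_cell(q, P, box):
    """Filter P with fvc(q); a linear-scan stand-in for the R-tree search."""
    cell = furthest_cell(q, P, box)
    return [p for p in P if contains(cell, p)]
```

A call such as `mrfn_by_cell((0, 0), [(10, 0), (12, 9), (1, 1)], (-20, -20, 20, 20))` keeps the two points that lie in the cell of the query. The early exit on an empty cell mirrors the "if fvc(q) is empty, return" step of the pseudocode.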
MRFN: Convex Hull Furthest Cell Algorithm (second algorithm)
Lemma: the furthest point for p from P is always a vertex of the convex hull of P (i.e., only vertices of the CH have RFN).
CHFC(Query q; R-tree T (on P))
  Find the convex hull $C_P$ of P; // compute only once
  If q lies inside $C_P$, then return empty;
  Else
    Compute $C_{P^*}$ using $C_P \cup \{q\}$;
    Set fvc(q, $P^*$) equal to fvc(q, $C_{P^*}$);
    Execute a range query using fvc(q, $P^*$) on T;
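The convex hull lemma can be sketched as follows. This is an assumption-laden stand-in rather than the CHFC implementation from the talk: the hull is built with Andrew's monotone chain, and the final R-tree range query with fvc(q, P*) is replaced by a linear scan that compares each point only against the hull vertices. Points are assumed to be tuples, and the function names are illustrative.

```python
from math import dist


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def chfc_sketch(q, P):
    """MRFN of q using only hull vertices as furthest-neighbor candidates."""
    hull_p = convex_hull(P)                # C_P, computed once per data set
    hull_star = convex_hull(hull_p + [q])  # hull of P* from C_P plus {q}
    if q not in hull_star:                 # q is not a hull vertex: no RFN
        return []
    return [p for p in P if max(hull_star, key=lambda v: dist(p, v)) == q]
```

Because the furthest neighbor of any point is a hull vertex, each point is compared against the hull only, which is usually far smaller than P.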
Dynamically updating the dataset
PFC: update the R-tree.
CHFC: update the R-tree and re-compute the CH (expensive): qhull algorithm.
Dynamically Maintaining CH: insertion
[Figure: points $p_1$ to $p_6$ and the inserted point $p_7$.]
$C_{P \cup \{p_7\}} = C_{C_P \cup \{p_7\}}$
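Insertion only needs the vertices of the old hull: the hull of P plus a new point equals the hull of the old hull vertices plus that point. A minimal sketch of that update, assuming tuple points and using Andrew's monotone chain as the hull routine (the talk uses qhull; `insert_point` is an illustrative name):

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def insert_point(hull_p, p_new):
    """Update the hull after an insertion, touching only old hull vertices."""
    return convex_hull(hull_p + [p_new])
```

Deletion is harder, because removing a hull vertex can expose points that were previously interior, so the old hull alone is not enough.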
Dynamically Maintaining CH: deletion
[Figure: points $p_1$ to $p_9$.]
The qhull algorithm
Dynamically Maintaining CH
[Figure: points $p_1$ to $p_3$ and edges $e_1$ to $e_3$, with minVdist and maxVdist marked.]
Adapt qhull to R-tree
BRFN
After resolving all the difficulties for the MRFN problem, solving the BRFN problem becomes almost immediate.
Observations:
All points in P that are contained in fvc(q, Q) have q as their furthest neighbor.
Only the vertices of the convex hull of Q have a furthest Voronoi cell.
BRFN algorithm
BRFN(Query q, Q; R-tree T)
  Compute the convex hull $C_Q$ of Q;
  If q is not a vertex of $C_Q$, then return empty;
  Else
    Compute fvc(q, $C_Q$);
    Execute a range query using fvc(q, $C_Q$) on T;
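The same idea carries over to the bichromatic case. The sketch below is a simplification under stated assumptions: the range query on the R-tree is replaced by a linear scan over P, the hull is built with Andrew's monotone chain (repeated here so the block runs on its own), points are tuples, and the function names are illustrative.

```python
from math import dist


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def brfn_sketch(q, Q, P):
    """Points of P whose furthest neighbor in Q is q, using the hull of Q."""
    hull_q = convex_hull(Q)
    if q not in hull_q:  # q is not a hull vertex, so its cell is empty
        return []
    # p lies in fvc(q, C_Q) exactly when q is its furthest hull vertex.
    return [p for p in P if max(hull_q, key=lambda v: dist(p, v)) == q]
```

With Q = [(0, 0), (10, 0), (5, 8), (5, 3)], the competitor (5, 3) is interior to the hull, so its answer is empty without looking at any customer.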
BRFN: Disk-Resident Query Group
Limitation: the query group Q may be too large to fit in memory.
Solution: approximate the convex hull of Q (Dudley's approximation).
Experiment Setup
Datasets: real datasets (maps: USA, CA, SF); synthetic datasets (UN, CB, R-Cluster).
Measurements: computation time and number of IOs, averaged over 1000 queries.
MRFN algorithms
[Figures: CPU computation; number of IOs.]
BRFN algorithms
[Figures: CPU, varying A with Q=1000; IOs, varying A with Q=1000.]
Scalability of various algorithms
[Figures: MRFN number of IOs; BRFN number of IOs.]
Conclusion
Introduced a novel query (RFN) for spatial databases.
Presented R-tree based algorithms for both versions of RFN that feature excellent pruning capability.
Conducted a comprehensive experimental evaluation.
Thank you! Questions?
Datasets: San Francisco
Datasets: California
Datasets: North America
Datasets: uncorrelated uniform
Datasets: correlated bivariate
Datasets: random clusters