Using Sets of Feature Vectors for Similarity Search on Voxelized CAD Objects
San Diego, 06/12/03. Martin Pfeifle, Database Group, University of Munich
Using Sets of Feature Vectors for Similarity Search on Voxelized CAD Objects
Hans-Peter Kriegel, Stefan Brecheisen, Peer Kröger, Martin Pfeifle, Matthias Schubert
ACM SIGMOD 2003, San Diego, California, June 9-12, 2003
Database Group, Institute for Computer Science, University of Munich, Germany
Outline of the Talk
• Introduction
• Space Partitioning Models
• Data Partitioning Models (Vector Set Model: new)
• Evaluation
• Conclusion
Introduction
System requirements:
• System should help to reduce the cost of developing new parts
• Avoidance of "reinventing the wheel"
• Reusing existing parts
[Figure: a similarity query on a CAD database of complex spatial objects ends in a timeout or unapt results; the goal is a similarity query that returns meaningful results in comparatively short time]
Solution:
• Efficient similarity search
• Effective similarity search
Both are addressed by a similarity model based on sets of feature vectors.
Space Partitioning Models: Feature Transformation
• A 3D CAD object is represented by a mesh of triangles
• Voxelization of triangle meshes and object normalization
• Partitioning of the data space into disjoint, enumerated cells
• Extraction of $k$ spatial features for each cell
• Similarity of objects = vicinity of the corresponding feature vectors
[Figure: CAD system -> triangle meshes -> normalized, voxelized object -> feature vector (0.75, 0.34, ...)]
Space Partitioning Models: Notation
• The data space is partitioned into $p$ axis-parallel grid cells in each dimension
• Let $r$ = the raster (voxel) resolution
• $V^o$ = set of voxels representing object $o \in O$
• $V_i^o$ = set of voxels covered by $o$ in cell $i$
• $f_o(i)$ = $i$-th value of the feature vector of $o$
[Figure: 2D example with r = 9 and p = 3, showing a CAD object and the voxel set $V^o$ representing it]
Space Partitioning Models: The Volume Model
• Count the number of object voxels $|V_i^o|$ in each cell $i$
• Normalize by the voxel capacity $K$ of each cell
• Feature value for cell $i$: $f_o(i) = \frac{|V_i^o|}{K}$, where $K = \left(\frac{r}{p}\right)^3$ in the 3D case
[Figure: 2D example; the per-cell voxel counts are scaled by 1/9]
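The volume model can be sketched in a few lines. This is only an illustration under stated assumptions: the object is given as a set of integer voxel coordinates in an r × r × r grid, p divides r, and the cells are enumerated in lexicographic order. The function name `volume_features` and the enumeration order are hypothetical and not taken from the talk.

```python
from itertools import product


def volume_features(voxels, r, p):
    """Volume-model feature vector for a voxelized object.

    voxels: iterable of (x, y, z) integer coordinates in [0, r).
    r: raster (voxel) resolution; p: number of cells per dimension.
    Assumes p divides r, so each cell holds K = (r / p) ** 3 voxels.
    """
    if r % p != 0:
        raise ValueError("this sketch assumes that p divides r")
    side = r // p                 # voxels per cell along one axis
    capacity = side ** 3          # K, the voxel capacity of a cell
    counts = {cell: 0 for cell in product(range(p), repeat=3)}
    for x, y, z in set(voxels):
        counts[(x // side, y // side, z // side)] += 1
    # Enumerate cells in lexicographic order (an arbitrary but fixed choice).
    return [counts[cell] / capacity for cell in sorted(counts)]


# A 2 x 2 x 2 block of voxels in the corner of a 6^3 grid with p = 3
# fills exactly one cell, so one feature is 1.0 and the rest are 0.0.
block = [(x, y, z) for x in range(2) for y in range(2) for z in range(2)]
features = volume_features(block, r=6, p=3)
print(len(features), features[0], sum(features))
```

The resulting vector has $p^3$ dimensions, one per cell, which is why the model is sensitive to the choice of $p$.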
Space Partitioning Models: The Solid Angle Model
• The solid angle model measures the concavity and convexity of surfaces
• Compute the SA-value $SA(v)$ for each surface voxel $v$ of object $o$: $SA(v) = \frac{|S_v \cap V^o|}{|S_v|}$, where $S_v$ is a voxelized reference sphere around $v$
• Each cell is represented by one dimension in the feature vector:
• $f_o(i) = 0$ if cell $i$ contains no voxel of $o$
• $f_o(i) = 1$ if cell $i$ contains only inside voxels of $o$
• $f_o(i) = \frac{1}{m} \sum_{j=1}^{m} SA(v_j)$ otherwise, averaged over the $m$ surface voxels $v_1, \dots, v_m$ of $o$ in cell $i$
[Figure: 2D example with reference spheres $S_x$ and $S_y$ around the surface voxels $x$ and $y$; example cell values 0.34, 0.30, 0.31, 0.32]
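The three cases of the solid angle model can be sketched as follows. Several details are assumptions made for this illustration and are not stated in the talk: a voxel counts as a surface voxel if one of its six face neighbours is empty, everything outside the grid is treated as empty, the reference sphere has an integer radius, and p divides r. All function names are hypothetical.

```python
def sphere_offsets(radius):
    """Integer offsets of a voxelized sphere of the given radius."""
    rng = range(-radius, radius + 1)
    return [(dx, dy, dz) for dx in rng for dy in rng for dz in rng
            if dx * dx + dy * dy + dz * dz <= radius * radius]


def is_surface(v, voxels):
    """Assumption: a voxel is on the surface if a face neighbour is empty."""
    x, y, z = v
    neighbours = [(x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
                  (x, y - 1, z), (x, y, z + 1), (x, y, z - 1)]
    return any(n not in voxels for n in neighbours)


def sa_value(v, voxels, offsets):
    """SA(v): fraction of the reference sphere around v that lies in the object."""
    x, y, z = v
    inside = sum((x + dx, y + dy, z + dz) in voxels for dx, dy, dz in offsets)
    return inside / len(offsets)


def solid_angle_features(voxels, r, p, radius=2):
    """One feature per cell: 0, 1, or the mean SA-value of its surface voxels."""
    voxels = set(voxels)
    side = r // p                      # assumes p divides r
    offsets = sphere_offsets(radius)
    occupied = set()                   # cells containing any object voxel
    sa_per_cell = {}                   # cell -> SA-values of its surface voxels
    for v in voxels:
        cell = tuple(c // side for c in v)
        occupied.add(cell)
        if is_surface(v, voxels):
            sa_per_cell.setdefault(cell, []).append(sa_value(v, voxels, offsets))
    features = []
    for cx in range(p):
        for cy in range(p):
            for cz in range(p):
                cell = (cx, cy, cz)
                if cell not in occupied:
                    features.append(0.0)            # no voxel of o
                elif cell not in sa_per_cell:
                    features.append(1.0)            # only inside voxels
                else:
                    values = sa_per_cell[cell]
                    features.append(sum(values) / len(values))
    return features


# A solid 6 x 6 x 6 cube: the central cell is entirely inside the object,
# all other cells contain surface voxels with SA-values below 1.
cube = [(x, y, z) for x in range(6) for y in range(6) for z in range(6)]
features = solid_angle_features(cube, r=6, p=3, radius=1)
print(features[13], round(features[0], 3))
```

Convex regions such as the cube corners yield small SA-values, while concave regions yield values closer to 1.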
Data Partitioning Models: Cover Sequence Model
• Approximation of the object by means of a cover sequence (Jagadish 91)
• Cover sequence: $S_k = (((C_0 \,\sigma_1\, C_1) \,\sigma_2\, C_2) \dots \sigma_k\, C_k)$, where $\sigma_i \in \{+, -\}$, $k$ is the number of covers, and the $C_i$ are axis-parallel (hyper-)rectangles
• Approximation quality: symmetric volume difference $Err_k = |o \;\mathrm{XOR}\; S_k|$
• Computation of $S_k$ by means of a greedy algorithm
• The object is represented by a $6 \cdot k$ dimensional feature vector (3D case)
2D feature vector $f_o$:
• $f_o(4 \cdot i + 1)$ = x-position of $C_i$
• $f_o(4 \cdot i + 2)$ = y-position of $C_i$
• $f_o(4 \cdot i + 3)$ = x-extension of $C_i$
• $f_o(4 \cdot i + 4)$ = y-extension of $C_i$
[Figure: 2D example of successive cover sequences, each shown with its error and 2D feature vector: $S_1 = (C_0 + C_1)$ with $Err_1 = 14$, $S_2 = ((C_0 + C_1) + C_2)$, and $S_3 = (((C_0 + C_1) + C_2) - C_3)$ with $Err_3 = 7$]
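To make the cover sequence and its error measure concrete, here is a small 2D sketch. It only evaluates a given sequence and does not implement the greedy search for the covers. Assumptions made for illustration: rectangles are stored as `(x, y, ex, ey)` with the position taken as the lower-left corner, objects are sets of grid cells, and the feature vector uses 0-based Python indices. All names are hypothetical.

```python
def rect_cells(rect):
    """Grid cells covered by an axis-parallel rectangle (x, y, ex, ey)."""
    x, y, ex, ey = rect
    return {(i, j) for i in range(x, x + ex) for j in range(y, y + ey)}


def apply_cover_sequence(covers):
    """Evaluate S_k by applying the signed covers from left to right.

    covers: list of (sign, rect) with sign '+' (add) or '-' (subtract).
    """
    cells = set()
    for sign, rect in covers:
        if sign == "+":
            cells |= rect_cells(rect)
        elif sign == "-":
            cells -= rect_cells(rect)
        else:
            raise ValueError(f"unknown sign: {sign!r}")
    return cells


def cover_error(obj_cells, covers):
    """Symmetric volume difference Err_k = |o XOR S_k|."""
    return len(set(obj_cells) ^ apply_cover_sequence(covers))


def feature_vector(covers):
    """Concatenate position and extension of every cover (4 values each in 2D)."""
    return [value for _, rect in covers for value in rect]


# An L-shaped object: a 4 x 4 square with its upper-right 2 x 2 corner removed.
l_shape = rect_cells((0, 0, 4, 4)) - rect_cells((2, 2, 2, 2))
one_cover = [("+", (0, 0, 4, 4))]
two_covers = one_cover + [("-", (2, 2, 2, 2))]
print(cover_error(l_shape, one_cover), cover_error(l_shape, two_covers))
print(feature_vector(two_covers))
```

Adding the second cover removes the remaining error, which mirrors how the error shrinks as covers are added in the 2D example on the slide.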
Data Partitioning Models: Vector Set Model
[Figure: 2D example with a query object and a database object, each approximated by a cover sequence $S_4$]
• $S_4^{query}$ (original) = $((((C_0 + C_1) - C_2) - C_3) - C_4)$
• $S_4^{query}$ (optimal) = $((((C_0 + C_1) - C_3) - C_4) - C_2)$
• Each object is one concatenated feature vector of cover positions (px, py) and extensions (ex, ey): $(q_1, q_2, q_3, q_4)$ for the query and $(db_1, db_2, db_3, db_4)$ for the database object
• $d_{euclid}((q_1, q_2, q_3, q_4), (db_1, db_2, db_3, db_4)) \gg d_{euclid}((q_1, q_3, q_4, q_2), (db_1, db_2, db_3, db_4))$
Data Partitioning Models: Vector Set Model
• The cover sequence $S_k = (((C_0 \,\sigma_1\, C_1) \,\sigma_2\, C_2) \dots \sigma_k\, C_k)$ is represented by a set of vectors $X \subset \mathbb{R}^6$, $|X| \le k$ (in the 3D case)
• Distance measure between two vector sets $X$ and $Y$: the minimum weight perfect matching
• Create a complete bipartite graph $G = (X \cup Y, X \times Y)$
• The weight of each edge $(x, y) \in X \times Y$ is $d_{euclid}(x, y)$
• Weight function for unmatched nodes if $|X| \neq |Y|$: distance to a dummy cover
• Computed by the Kuhn-Munkres algorithm in $O(k^3)$
[Figure: 2D example; the covers of the query object and of the database object are each shown as a set of 4-dimensional vectors (position X, position Y, extension X, extension Y) instead of one concatenated vector]
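The following sketch illustrates the minimum weight perfect matching distance on small vector sets. It is not the Kuhn-Munkres algorithm named on the slide: for brevity it tries all assignments by brute force, which is only feasible for a handful of covers. The dummy cover is assumed here to be the zero vector unless one is passed in; the talk does not specify it. The function names are hypothetical.

```python
from itertools import permutations
from math import dist


def matching_distance(xs, ys, dummy=None):
    """Minimum weight perfect matching distance between two vector sets.

    Brute force over all assignments, so only suitable for small sets.
    Unmatched vectors of the larger set are charged their distance to
    `dummy` (assumed to be the zero vector unless given).
    """
    xs, ys = list(xs), list(ys)
    if len(xs) < len(ys):
        xs, ys = ys, xs               # make xs the larger set
    if not xs:
        return 0.0
    if dummy is None:
        dummy = (0.0,) * len(xs[0])
    # Pad the smaller set with dummy covers so both sets have equal size.
    ys = ys + [dummy] * (len(xs) - len(ys))
    return min(
        sum(dist(x, y) for x, y in zip(xs, perm))
        for perm in permutations(ys)
    )


def flatten(covers):
    """Concatenate the covers into one feature vector (cover sequence model)."""
    return [value for cover in covers for value in cover]


# The same three covers, listed in a different order.
query = [(0, 0, 4, 4), (2, 2, 2, 2), (5, 5, 1, 1)]
database = [(5, 5, 1, 1), (0, 0, 4, 4), (2, 2, 2, 2)]
print(dist(flatten(query), flatten(database)))   # depends on the cover order
print(matching_distance(query, database))        # independent of the order
```

The concatenated vectors are far apart although both objects consist of identical covers, whereas the matching distance is zero. This is the order problem that the vector set model avoids.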
Data Partitioning Models: Vector Set Model
• Efficient similarity queries based on multi-step query processing: range queries (Faloutsos et al. 94), k-nearest neighbor queries (Korn et al. 96), optimal multi-step k-nearest neighbor search (Seidl, Kriegel 98)
• Filter step (index-based) -> candidates -> refinement step (exact evaluation) -> results
• Filter: $k$ (= cardinality of the two vector sets) times the distance between the centroids of the two vector sets lower-bounds the minimum weight perfect matching distance
• The lower-bounding property guarantees no false drops: $\forall o_1, o_2 \in O: d_o(o_1, o_2) \ge d_f(o_1, o_2)$, where $d_o$ is the exact object distance and $d_f$ the filter distance
[Figure: 2D example showing the query object, the database object, and their centroids in (position X, position Y, extension X, extension Y) space]
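The centroid filter is a lower bound because of the triangle inequality: for any matching, the sum of the edge lengths is at least the length of the summed difference vectors, which equals $k$ times the distance between the two centroids. A minimal sketch of a multi-step range query built on this bound follows. It assumes both sets have exactly $k$ vectors, scans the database linearly instead of using an index such as the X-tree, and uses a brute-force matching as a stand-in for Kuhn-Munkres. All names are hypothetical.

```python
from itertools import permutations
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def filter_distance(xs, ys):
    """k times the centroid distance; assumes len(xs) == len(ys) == k."""
    return len(xs) * dist(centroid(xs), centroid(ys))


def exact_distance(xs, ys):
    """Brute-force minimum weight perfect matching (stand-in for Kuhn-Munkres)."""
    return min(sum(dist(x, y) for x, y in zip(xs, perm))
               for perm in permutations(ys))


def range_query(query, database, eps):
    """Multi-step range query: cheap filter first, exact refinement second."""
    candidates = [obj for obj in database
                  if filter_distance(query, obj) <= eps]
    results = [obj for obj in candidates
               if exact_distance(query, obj) <= eps]
    return candidates, results


query = [(0, 0), (2, 0)]
db_a = [(0, 1), (2, 1)]        # filter distance 2, pruned for eps = 1
db_b = [(-1, 0), (3, 0)]       # same centroid as the query, exact distance 2
db_c = [(10, 10), (12, 10)]    # far away, pruned by the filter
db_d = [(2, 0), (0, 0)]        # the query covers in a different order
candidates, results = range_query(query, [db_a, db_b, db_c, db_d], eps=1.0)
print(len(candidates), len(results))
```

Here `db_b` survives the filter but is rejected in the refinement step. The filter may let such false hits through, but it never discards a true result.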
Evaluation: k-nn Queries
• Evaluation of similarity models by means of k-nn queries: report the $k$ objects having the smallest distance to a query object $q$
[Figure: k-nn result lists of the volume model and the solid angle model for two query objects, contrasting a "good" and a "bad" similarity model. Distance rows shown: 0.0, 0.0, 0.368, 0.368, 0.666; 0.0, 0.0098, 0.307, 0.416, 0.46; 0.0, 0.0, 0.0176, 0.0178, 0.022; 0.0, 0.04, 0.04, 0.07, 0.12]
Problem:
• evaluation using k-nn queries is subjective
• quality measure of a model depends on the choice of the query objects
Evaluation: Hierarchical Clustering
• More objective, since each object of the database is taken into account to measure the quality of a similarity model
OPTICS (Kriegel et al. 99)
• Yields a density-based hierarchical clustering
• Insensitive to input parameters
• Result (the so-called reachability plot) can be easily visualized and is suitable for interactive exploration
[Figure: data space with clusters A (containing A1 and A2) and B, next to the corresponding reachability plot]
Evaluation: Space Partitioning Similarity Models
Car dataset: approx. 200 parts, r = 30, p = 3
[Figure: reachability plots for the Volume Model and the Solid Angle Model; annotations: Class A, Class B, Class C, and "no classes found"]
Evaluation: Data Partitioning Similarity Models
Car dataset: approx. 200 parts, r = 15, 7 covers
[Figure: reachability plots. Cover Sequence Model: marked classes X, A, C, E, G. Vector Set Model: marked classes A1, A2, B, C, D, E, F, G1, G2]
Evaluation: Efficiency of the Vector Set Model
Efficiency evaluation: 100 10-nn queries on the plane database, cover sequence with 7 covers

Model                              CPU time [sec]   I/O time [sec]   Total runtime [sec]
vector set without filter          1025.32          806.40           1831.72
vector set with filter (X-tree)    105.88           932.80           1038.68
cover sequence (X-tree)            142.82           2632.06          2774.88

Vector set model <-> cover sequence model:
• Vector set model outperforms cover sequence model
Vector set model without filter <-> vector set model with filter:
• Filter step leads to a speed-up factor of approximately 2
• Filter step has a selectivity of approximately 20%
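The quoted speed-up can be checked against the runtimes reported above. The short calculation below uses only those numbers; the dictionary layout is an illustrative choice.

```python
# Runtimes in seconds as (CPU, I/O), copied from the measurements above.
runtimes = {
    "vector set without filter": (1025.32, 806.40),
    "vector set with filter (X-tree)": (105.88, 932.80),
    "cover sequence (X-tree)": (142.82, 2632.06),
}
totals = {name: cpu + io for name, (cpu, io) in runtimes.items()}
baseline = totals["vector set with filter (X-tree)"]
for name, total in totals.items():
    print(f"{name}: {total:.2f} s, {total / baseline:.2f}x the filtered vector set")
```

The filter reduces the total runtime by a factor of about 1.8, which is the "approximately 2" of the slide, and the cover sequence model is about 2.7 times slower than the filtered vector set model. Most of the saving comes from CPU time, which drops by a factor of almost 10, while I/O time rises slightly.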
Conclusion
Contribution:
• Sets of feature vectors: a new way of representing objects in similarity search, somewhere between feature vectors and graphs
• Effective and efficient similarity model for CAD data based on sets of feature vectors
• Evaluation of similarity models based on hierarchical clustering
[Figure: covers of the query object and the database object shown as sets of vectors (position X, position Y, extension X, extension Y)]
Conclusion
Future work: BOSS (Browsing OPTICS-Plots for Similarity Search)
• Interactive data browsing tool based on reachability plots
• User-friendly method to support the time-consuming task of finding similar parts:
• Revealing the hierarchical clustering structure of the dataset at a glance
• Displaying suitable representatives for large clusters
Thank you for your attention
Any questions?