A Binary Linear Programming Formulation of the Graph Edit Distance
Presented by Shihao Ji
Duke University Machine Learning Group
July 17, 2006
Authors: Derek Justice & Alfred Hero (PAMI 2006)
Outline
• Introduction to Graph Matching
• Proposed Method (binary linear program)
• Experimental Results (chemical graph matching)
Graph Matching
• Objective: matching a sample input graph to a database of known prototype graphs.
Graph Matching (cont’d)
• A real example: face identification
Graph Matching (cont’d)
Key issues: (1) representative graph generation
(a) facial graph representations
(b) chemical graphs
Graph Matching (cont’d)
Key issues: (2) graph distance metrics
• Maximum Common Subgraph (MCS)
• Graph Edit Distance (GED)
  - Enumeration procedures (for small graphs)
  - Probabilistic models (MAP estimates)
  - Binary Linear Programming (BLP)
Graph Edit Distance
• Basic idea: define graph edit operations (such as insertion, deletion, or relabeling of a vertex) along with a cost associated with each operation.
• The GED between two graphs is the cost of the least costly series of edit operations needed to make the two graphs isomorphic.
• Key issues: how to find the least costly series of edit operations? How to define the edit costs?
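As a minimal sketch of the definition above, the following Python enumerates every vertex correspondence between two tiny labeled graphs and keeps the cheapest one. It assumes unit costs for every operation and pads both graphs with null vertices so that insertions and deletions become ordinary matches. The name `brute_force_ged` and the graph encoding are hypothetical, not taken from the slides, and the factorial search is only feasible for a handful of vertices.

```python
from itertools import permutations


def brute_force_ged(labels0, edges0, labels1, edges1,
                    vertex_cost=1.0, edge_cost=1.0):
    """Exhaustive edit distance for tiny undirected, vertex-labeled graphs.

    Assumption: one shared cost for vertex insert/delete/relabel and one
    shared cost for edge insert/delete (unit costs by default).
    """
    n = len(labels0) + len(labels1)
    # Null (None) vertices let a real vertex map to "nothing" (a deletion)
    # and let "nothing" map to a real vertex (an insertion).
    pad0 = list(labels0) + [None] * (n - len(labels0))
    pad1 = list(labels1) + [None] * (n - len(labels1))
    e0 = {frozenset(e) for e in edges0}
    e1 = {frozenset(e) for e in edges1}

    best = float("inf")
    for perm in permutations(range(n)):
        # Vertex edits: any label mismatch (including null vs. real) costs one edit.
        cost = sum(vertex_cost for i in range(n) if pad0[i] != pad1[perm[i]])
        # Edge edits: an edge present on one side of the mapping but not the other.
        for i in range(n):
            for j in range(i + 1, n):
                has0 = frozenset((i, j)) in e0
                has1 = frozenset((perm[i], perm[j])) in e1
                if has0 != has1:
                    cost += edge_cost
        best = min(best, cost)
    return best


# Toy usage: path C-C-O versus the single bond C-O.
g0 = (["C", "C", "O"], [(0, 1), (1, 2)])
g1 = (["C", "O"], [(0, 1)])
distance = brute_force_ged(*g0, *g1)
```

The search grows as (n0 + n1)!, which is why enumeration procedures are listed as suitable for small graphs only.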
Graph Edit Distance (cont’d)
• How to compute the distance between G0 and G1?
• Edit Grid
• Isomorphisms of G0 on the edit grid
• State Vectors
Graph Edit Distance (cont’d)
standard placement
• Definition: (if the cost function c is a metric)
• Objective function: binary linear program (NP-hard!!!)
Graph Edit Distance (cont’d)
• Lower bound: linear program (polynomial time)
• Upper bound: assignment problem (polynomial time)
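One plausible way to obtain such an upper bound, sketched under stated assumptions: solve the assignment problem on vertex costs alone, then price the full edit (vertices and edges) implied by that assignment. Any complete vertex correspondence is a feasible edit sequence, so its cost cannot fall below the GED. The unit costs, the null-vertex padding, and the names `hungarian`, `mapping_cost`, and `ged_upper_bound` are illustrative choices, not necessarily the authors' construction.

```python
def hungarian(cost):
    """Minimum-cost assignment on a square matrix (O(n^3) potentials method).

    Returns a list where entry i is the column assigned to row i.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-indexed)
    v = [0.0] * (n + 1)          # column potentials (1-indexed)
    p = [0] * (n + 1)            # p[j] = row matched to column j, 0 = free
    way = [0] * (n + 1)          # predecessor column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:           # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [0] * n
    for j in range(1, n + 1):
        assign[p[j] - 1] = j - 1
    return assign


def pad(labels, n):
    """Append null (None) vertices until the label list has length n."""
    return list(labels) + [None] * (n - len(labels))


def mapping_cost(labels0, edges0, labels1, edges1, perm):
    """Unit-cost total of the edits implied by the vertex mapping perm."""
    n = len(perm)
    pad0, pad1 = pad(labels0, n), pad(labels1, n)
    e0 = {frozenset(e) for e in edges0}
    e1 = {frozenset(e) for e in edges1}
    cost = sum(1 for i in range(n) if pad0[i] != pad1[perm[i]])
    for i in range(n):
        for j in range(i + 1, n):
            if (frozenset((i, j)) in e0) != (frozenset((perm[i], perm[j])) in e1):
                cost += 1
    return cost


def ged_upper_bound(labels0, edges0, labels1, edges1):
    """Assign vertices by label cost only, then price the whole edit."""
    n = len(labels0) + len(labels1)
    pad0, pad1 = pad(labels0, n), pad(labels1, n)
    vertex_costs = [[0 if a == b else 1 for b in pad1] for a in pad0]
    perm = hungarian(vertex_costs)
    return mapping_cost(labels0, edges0, labels1, edges1, perm), perm


# Toy usage: path C-C-O versus the single bond C-O.
bound, perm = ged_upper_bound(["C", "C", "O"], [(0, 1), (1, 2)],
                              ["C", "O"], [(0, 1)])
```

Because the assignment step ignores edges, ties between equally labeled vertices can be broken badly, so the bound may be loose; it is never below the true edit distance under the same costs.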
Graph Edit Distance (cont’d)
Edit Cost Selection
• Goal: suppose there is a set of prototype graphs {Gi}, i=1,…,N, and we classify a sample graph G0 by a nearest neighbor classifier in the metric space defined by the graph edit distance.
• Prior information: the prototypes should be roughly uniformly distributed in the metric space of graphs.
• Why: it minimizes the worst case classification error since it equalizes the probability of error under a nearest neighbor classifier.
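The classification rule described above can be sketched as follows. The `distance` argument stands in for any graph edit distance routine; the name `nearest_prototype` and the toy distance in the usage lines are hypothetical placeholders.

```python
def nearest_prototype(sample, prototypes, distance):
    """Return (index, distance) of the prototype closest to the sample."""
    best_index, best_dist = None, float("inf")
    for index, proto in enumerate(prototypes):
        d = distance(sample, proto)
        if d < best_dist:
            best_index, best_dist = index, d
    return best_index, best_dist


# Toy usage: integers stand in for graphs, absolute difference for the GED.
toy_prototypes = [0, 10, 20]
label, dist = nearest_prototype(12, toy_prototypes, lambda a, b: abs(a - b))
```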
Edit Cost Selection (cont’d)
• Objective: minimize the variance of pairwise NN distances
• Define unit cost function, i.e., c(0,1)=1, c(,)=1, c(,)=0
• Solve the BLP (with unit cost) and find the NN pair
• Construct Hk,i = the number of times the ith edit operation is used for the kth NN pair
• Objective function: (convex optimization)
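The exact objective and its constraints did not survive in this transcript, so the following is only a sketch of the quantity being minimized. It assumes that each NN distance is linear in the edit costs, i.e. the distance for the kth pair is the sum over i of Hk,i times the cost of the ith operation. The function names and the toy matrix are hypothetical.

```python
def nn_distances(H, c):
    """Distance of each NN pair: row k of H dotted with the cost vector c."""
    return [sum(h * ci for h, ci in zip(row, c)) for row in H]


def nn_distance_variance(H, c):
    """Population variance of the NN distances under cost vector c."""
    d = nn_distances(H, c)
    mean = sum(d) / len(d)
    return sum((x - mean) ** 2 for x in d) / len(d)


# Toy usage: two NN pairs, two edit operations.
# Pair 0 uses operation 0 once; pair 1 uses operation 1 twice.
H = [[1, 0], [0, 2]]
var_unit = nn_distance_variance(H, [1.0, 1.0])   # distances 1 and 2
var_tuned = nn_distance_variance(H, [2.0, 1.0])  # distances 2 and 2
```

In this toy case, doubling the cost of the first operation equalizes the two NN distances, which is the effect the minimum variance criterion is after. A real formulation would also need a constraint that rules out the trivial all-zero cost vector.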
Experimental Results
• Chemical Graph Recognition
1. edge edit
2. vertex deletion
3. vertex insertion
4. vertex relabeling
5. random
(a) original graph
Experimental Results (cont’d)
(b) example perturbed graphs
Experimental Results (cont’d)
• Optimal Edit Costs
A: GEDo, B: GEDu, C: MCS1, D: MCS2
Experimental Results (cont’d)
• Classification Results
Conclusion
• Present a binary linear programming formulation of the graph edit distance;
• Offer a minimum variance method for choosing a cost metric;
• Demonstrate the utility of the new method in the context of chemical graph recognition.