Multi-Label Feature Selection for Graph Classification


Transcript of Multi-Label Feature Selection for Graph Classification


Multi-Label Feature Selection for Graph Classification
Xiangnan Kong, Philip S. Yu
Department of Computer Science, University of Illinois at Chicago

Outline
- Introduction
- Multi-Label Feature Selection for Graph Classification
- Experiments
- Conclusion

Introduction: Graph Data

Examples: program flows, XML documents, chemical compounds. Conventional data mining and machine learning approaches assume data are represented as feature vectors, e.g. (x1, x2, …, xd) → y. In real applications, however, data are often not directly represented as feature vectors, but as graphs with complex structures, e.g. G(V, E, l) → y.

Introduction: Graph Classification

Graph Classification: construct a classification model for graph data. Training graphs are labeled (+ / −); the task is to predict the label of a testing graph.

Example: drug activity prediction. Given a set of chemical compounds labeled with activities against one type of disease or virus, predict active / inactive for a testing compound.

Graph Classification using Subgraph Features
[Figure: graph objects G1, G2 are mapped, via subgraph patterns g1, g2, g3, to binary feature vectors (e.g. 100 and 111) that are fed to a classifier.]
Each graph object becomes a feature vector whose i-th entry indicates whether the graph contains subgraph pattern g_i; conventional classifiers are then applied to these vectors.
Key question: how to find a set of subgraph features in order to effectively perform graph classification?

Existing Methods for Subgraph Feature Selection

Feature Selection for Graph Classification: find a set of useful subgraph features for classification, i.e. subgraphs that separate the + graphs from the − graphs.

Existing methods select discriminative subgraph features, but they focus on single-label settings: they assume each graph can have only one label.
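The subgraph-feature mapping described earlier can be sketched as a toy. Here simple edge-set containment stands in for true subgraph isomorphism testing, and all graphs, patterns, and names are illustrative, not from the paper's implementation:

```python
# Sketch (simplified): represent each graph as a frozenset of labeled edges
# and treat "contains subgraph pattern" as edge-set containment. Real subgraph
# isomorphism is much harder; this toy only stands in for it.

def to_feature_vector(graph_edges, patterns):
    """x[k] = 1 iff every labeled edge of pattern k appears in the graph."""
    return [1 if p <= graph_edges else 0 for p in patterns]

# Labeled edges stored as (label_u, label_v) pairs, in both orders.
G1 = frozenset({("N", "H"), ("H", "N"), ("C", "C")})
g1 = frozenset({("N", "H"), ("H", "N")})   # pattern: an N-H bond
g2 = frozenset({("O", "C"), ("C", "O")})   # pattern: a C-O bond
print(to_feature_vector(G1, [g1, g2]))  # -> [1, 0]
```

With a real subgraph miner the containment test would be a subgraph isomorphism check, but the resulting 0/1 feature vectors are used the same way.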

Multi-Label Graphs
In many real applications, one graph can have multiple labels. Example (anti-cancer drug prediction): a single compound may be labeled − Lung Cancer, + Melanoma, + Breast Cancer.

Other applications:
- XML document classification (one document → multiple tags)
- Program flow error detection (one program → multiple types of errors)
- Kinase inhibitor discovery (one chemical → multiple types of kinases)

Multi-Label Feature Selection for Graph Classification

Goal: find useful subgraph features for graphs with multiple labels. Pipeline: multi-label graphs → candidate subgraph features → evaluation criteria → multi-label classification.

Two Key Questions to Address
Evaluation: how to evaluate a set of subgraph features using the multiple labels of the graphs? (effective)
Search space pruning: how to prune the subgraph search space using the multiple labels of the graphs? (efficient)

What is a good feature?
Dependence maximization: maximize the dependence between the features and the multiple labels of the graphs.
Assumption: graphs with similar label sets should have similar features.

So good features should maximize this feature-label dependence.

Dependence Measure
Hilbert-Schmidt Independence Criterion (HSIC) [Gretton et al. 05]: evaluates the dependence between the input features and the label vectors in kernel space. Its empirical estimate is easy to calculate.

K_S : kernel matrix for graphs, using the common subgraph features in S. K_S[i, j] measures the similarity between graph i and graph j on the common subgraph features (in S) that they contain.

L : kernel matrix for label vectors, using the label vectors in {0, 1}^Q. L[i, j] measures the similarity between the label sets of graph i and graph j.

H = I − (1/n) 1 1^T : centering matrix.

The empirical estimate is

HSIC(S) = tr(K_S H L H)
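A minimal pure-Python sketch of this estimate; the kernel values below are toy numbers (in practice K_S would come from shared subgraph features and L from label-set similarity), so treat them as illustrative only:

```python
# Sketch of the empirical HSIC estimate HSIC(S) = tr(K_S · H · L · H),
# where H = I - (1/n)·11^T is the centering matrix. Toy 3-graph example:
# graphs 1 and 2 share subgraph features (K) and share label sets (L).

def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def hsic(K, L):
    n = len(K)
    H = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
    M = matmul(matmul(K, H), matmul(L, H))   # K_S · H · L · H
    return sum(M[i][i] for i in range(n))    # trace

K = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 2.0]]
L = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
print(round(hsic(K, L), 3))  # -> 3.111
```

Because the feature kernel and the label kernel agree on which pairs of graphs are similar, the dependence value is clearly positive here.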

gHSIC Score
Optimizing the HSIC criterion yields the gHSIC criterion. Let f_{g_i} represent the indicator vector of the i-th subgraph feature (its j-th entry is 1 iff graph j contains g_i). With M = H L H, the gHSIC score of a feature set S is

gHSIC(S) = Σ_{g_i ∈ S} f_{g_i}^T M f_{g_i}

Objective: maximize the dependence (HSIC). E.g. an N-H subgraph occurring only in similarly-labeled graphs is good; a C-C-C-C chain occurring in nearly every graph is bad.
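The per-feature score above can be sketched numerically; the matrix M below is a hand-picked stand-in for H·L·H on three toy graphs (graphs 1 and 2 sharing label sets), so all numbers are illustrative:

```python
# Sketch of the per-subgraph gHSIC score h(g) = f_g^T · M · f_g, where
# f_g is the 0/1 vector of which graphs contain subgraph g and M = H·L·H.

def ghsic_score(f, M):
    n = len(f)
    return sum(f[i] * M[i][j] * f[j] for i in range(n) for j in range(n))

# Hypothetical doubly-centered M (rows/columns sum to ~0, as H·L·H does).
M = [[ 0.44,  0.44, -0.89],
     [ 0.44,  0.44, -0.89],
     [-0.89, -0.89,  1.78]]
f_good = [1, 1, 0]   # occurs exactly in the two similarly-labeled graphs
f_bad  = [1, 1, 1]   # occurs everywhere: carries no label information
print(ghsic_score(f_good, M) > ghsic_score(f_bad, M))  # -> True
```

Note that a feature present in every graph scores near zero: since M is doubly centered, summing all of its entries cancels out, which matches the intuition that such a feature cannot separate any label sets.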

Two Key Questions to Address
Evaluation: how to evaluate a set of subgraph features with the multiple labels of the graphs? (effective)

Search space pruning: how to prune the subgraph search space using the multiple labels of the graphs? (efficient)

Finding a Needle in a Haystack
The pattern search tree enumerates subgraph patterns by growing them edge by edge (0 edges, 1 edge, 2 edges, …). gSpan [Yan et al., ICDM 02] is an efficient algorithm for enumerating all frequent subgraph patterns (frequency ≥ min_support); nodes that are not frequent are not expanded.

Even so, there are too many frequent subgraph patterns, and we want the most useful one(s) according to the multiple labels. How can we find the best node(s) in this tree without searching all the nodes? (Branch and bound to prune the search space.)

gHSIC Upper Bound
gHSIC(g_i) = f_{g_i}^T M f_{g_i}, where f_{g_i} represents the indicator vector of the i-th subgraph feature.

An upper bound of gHSIC: for any supergraph g' of g,

gHSIC(g') ≤ gHSIC-UB(g) = Σ_{i,j : graphs i and j both contain g} max(0, M[i, j])

gHSIC-UB(g) upper-bounds the gHSIC scores of all supergraphs of g, and it is anti-monotonic with subgraph frequency: any supergraph of g can occur only in a subset of the graphs containing g, so extending a pattern never increases the bound.

----> Pruning
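The resulting branch-and-bound search can be sketched as follows. Here `children`, `score`, and `upper_bound` are hypothetical stand-ins for gSpan's pattern extension, the gHSIC score, and the anti-monotonic bound; the toy tree and its scores are invented for illustration:

```python
# Sketch of branch-and-bound search over the subgraph pattern tree.
# A sub-tree is pruned whenever its upper bound cannot beat the best
# score found so far.

def branch_and_bound(root, children, score, upper_bound):
    best_node, best_score = root, score(root)
    stack = [root]
    while stack:
        node = stack.pop()
        s = score(node)
        if s > best_score:
            best_node, best_score = node, s
        for child in children(node):
            # Prune: no supergraph in this sub-tree can beat the best so far.
            if upper_bound(child) > best_score:
                stack.append(child)
    return best_node, best_score

# Toy pattern tree: nodes are tuples of edges, scores/bounds hand-picked.
tree   = {(): [("a",), ("b",)], ("a",): [("a", "c")], ("b",): [], ("a", "c"): []}
scores = {(): 0.0, ("a",): 0.5, ("b",): 0.1, ("a", "c"): 0.9}
bounds = {(): 1.0, ("a",): 1.0, ("b",): 0.2, ("a", "c"): 0.9}
node, s = branch_and_bound((), tree.get, scores.get, bounds.get)
print(node, s)  # -> ('a', 'c') 0.9
```

In the real algorithm the tree is generated lazily by gSpan's DFS-code extension rather than stored as a dictionary, but the pruning logic is the same.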

Pruning Principle
During the search over the pattern search tree, keep the best subgraph and best gHSIC score found so far. At the current node, compare that best score with the upper bound for the node's sub-tree: if best score ≥ upper bound, we can prune the entire sub-tree.

Experiment Setup
Four methods are compared:
- Multi-label feature selection + multi-label classification: gMLC [this paper] + BoosTexter [Schapire & Singer 00]
- Multi-label feature selection + binary classification: gMLC [this paper] + BR-SVM [Boutell et al. 04] (binary relevance)
- Single-label feature selection + binary classification: BR (binary relevance) + information gain + SVM
- Top-k frequent subgraphs + multi-label classification: gSpan [Yan & Han 02] + BoosTexter [Schapire & Singer 00]

Data Sets
Three multi-label graph classification tasks:
- Anti-cancer activity prediction
- Toxicology prediction of chemical compounds
- Kinase inhibitor prediction

Evaluation
Multi-label metrics [Elisseeff & Weston, NIPS 02]:
- Ranking loss: average number of label pairs ranked incorrectly; the smaller, the better.
- Average precision: average fraction of correct labels among the top-ranked labels; the larger, the better.

All results use 10 times 10-fold cross-validation.
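The two metrics can be sketched for a single test example as follows; this is a toy implementation following the definitions above (per-example normalization by the number of relevant/irrelevant pairs is one common convention), with averaging over the test set omitted:

```python
# Sketch of the two multi-label metrics. `scores` are predicted ranking
# scores per label, `y` the true 0/1 label vector of one example.

def ranking_loss(scores, y):
    """Fraction of (relevant, irrelevant) label pairs ranked incorrectly."""
    pos = [i for i, v in enumerate(y) if v == 1]
    neg = [i for i, v in enumerate(y) if v == 0]
    bad = sum(1 for i in pos for j in neg if scores[i] <= scores[j])
    return bad / (len(pos) * len(neg))

def average_precision(scores, y):
    """Average, over relevant labels, of precision at each one's rank."""
    order = sorted(range(len(y)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y[i] == 1:
            hits += 1
            total += hits / rank
    return total / sum(y)

y = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
print(ranking_loss(scores, y), average_precision(scores, y))
```

Here one relevant label (index 2) is ranked below one irrelevant label (index 1), giving a ranking loss of 0.25 and an average precision of 5/6.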

Experiment Results
Results are reported on the Anti-Cancer (NCI), Kinase Inhibition, and PTC (toxicology) datasets, measured by ranking loss and 1 − average precision.

[Figure: ranking loss (lower is better) vs. number of selected features on the Anti-Cancer dataset, comparing Multi-Label FS + multi-label classifier, Multi-Label FS + single-label classifiers, Single-Label FS + single-label classifiers, and Unsupervised FS + multi-label classifier.] Our approach with a multi-label classifier performed best on the NCI and PTC datasets.

Pruning Results
[Figure: running time in seconds (lower is better), with and without gHSIC pruning, on the anti-cancer dataset.]
[Figure: number of subgraphs explored (lower is better), with and without gHSIC pruning, on the anti-cancer dataset.]

Conclusions
Multi-Label Feature Selection for Graph Classification

- Evaluating subgraph features using the multiple labels of the graphs (effective)
- Branch-and-bound pruning of the search space using the multiple labels of the graphs (efficient)

Thank you!