Decision tree and instance-based learning for label ranking

Weiwei Cheng, Jens Hühn and Eyke Hüllermeier
Knowledge Engineering & Bioinformatics Lab
Department of Mathematics and Computer Science
University of Marburg, Germany

Label Ranking (an example)

Learning customers’ preferences on cars:

where the customers could be described by feature vectors, e.g., (gender, age, place of birth, has child, …)

              label ranking
customer 1    MINI ≻ Toyota ≻ BMW
customer 2    BMW ≻ MINI ≻ Toyota
customer 3    BMW ≻ Toyota ≻ MINI
customer 4    Toyota ≻ MINI ≻ BMW
new customer  ???

Label Ranking (an example)

Learning customers’ preferences on cars:

π(i) = position of the i‐th label in the ranking
1: MINI   2: Toyota   3: BMW

              MINI  Toyota  BMW
customer 1     1      2      3
customer 2     2      3      1
customer 3     3      2      1
customer 4     2      1      3
new customer   ?      ?      ?
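The position encoding π(i) can be illustrated with a minimal Python sketch. The function name `positions` and the dictionary layout are illustrative assumptions, not part of the original slides; the label indexing (1: MINI, 2: Toyota, 3: BMW) is taken from the slide.

```python
# Fixed label indexing from the slide: 1: MINI, 2: Toyota, 3: BMW
LABELS = ["MINI", "Toyota", "BMW"]

def positions(ordering, labels=LABELS):
    """Return pi as a list, where pi[i] is the 1-based position of labels[i]."""
    rank = {label: pos for pos, label in enumerate(ordering, start=1)}
    return [rank[label] for label in labels]

# Orderings from most to least preferred (hypothetical variable name)
customers = {
    "customer 1": ["MINI", "Toyota", "BMW"],
    "customer 2": ["BMW", "MINI", "Toyota"],
    "customer 3": ["BMW", "Toyota", "MINI"],
    "customer 4": ["Toyota", "MINI", "BMW"],
}

for name, ordering in customers.items():
    print(name, positions(ordering))
```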

Label Ranking (more formally)

Given:
a set of training instances {x_k | k = 1, …, m} ⊆ X
a set of labels L = {λ_i | i = 1, …, n}
for each training instance x_k: a set of pairwise preferences of the form λ_i ≻ λ_j (for some of the labels)

Find:
A ranking function (an X → Ω mapping, with Ω the set of permutations of L) that maps each x ∈ X to a ranking ≻_x of L (permutation π_x) and generalizes well in terms of a loss function on rankings (e.g., Kendall’s tau)
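As an illustration of the evaluation measure, here is a minimal sketch of Kendall's tau between two rankings given as position vectors. It assumes complete rankings without ties; the function names are illustrative and not taken from the slides.

```python
from itertools import combinations

def kendall_distance(pi, sigma):
    """Number of label pairs that the two position vectors order differently."""
    return sum(
        1
        for i, j in combinations(range(len(pi)), 2)
        if (pi[i] - pi[j]) * (sigma[i] - sigma[j]) < 0
    )

def kendall_tau(pi, sigma):
    """Kendall's tau in [-1, 1]: +1 for identical rankings, -1 for reversed ones."""
    n = len(pi)
    n_pairs = n * (n - 1) // 2
    return 1.0 - 2.0 * kendall_distance(pi, sigma) / n_pairs

print(kendall_tau([1, 2, 3], [1, 2, 3]))  # identical rankings
print(kendall_tau([1, 2, 3], [3, 2, 1]))  # reversed rankings
```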

Existing Approaches

… essentially reduce label ranking to classification:

Ranking by pairwise comparison
Fürnkranz and Hüllermeier, ECML‐03

Constraint classification (CC)
Har‐Peled, Roth and Zimak, NIPS‐03

Log-linear models for label ranking
Dekel, Manning and Singer, NIPS‐03

− are efficient but may come with a loss of information
− may have an improper bias and lack flexibility
− may produce models that are not easily interpretable

Local Approach (this work)

Target function is estimated (on demand) in a local way.
Distribution of rankings is (approx.) constant in a local region.
Core part is to estimate the locally constant model.

[Figure: the two local learners considered, nearest neighbor and decision tree]

Local Approach (this work)

Output (ranking) of an instance x is generated according to a distribution P(· | x) on Ω, the set of all rankings of the labels.

This distribution is (approximately) constant within the local region under consideration.

Nearby preferences are considered as a sample generated by P, which is estimated on the basis of this sample via maximum likelihood (ML).

Probabilistic Model for Ranking

Mallows model (Mallows, Biometrika, 1957):

P(σ | θ, π) = exp(−θ D(σ, π)) / φ(θ)

with center ranking π, spread parameter θ, and normalization constant φ(θ); D is a right-invariant metric on permutations.
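A minimal sketch of the Mallows model, in which the probability of a ranking σ is proportional to exp(−θ·D(σ, π)). It assumes D is the Kendall distance (one possible right-invariant metric; the slide does not fix the choice here) and represents rankings as position vectors. The brute-force normalization is only practical for a handful of labels; for the Kendall distance the normalization constant also has a well-known closed form, included for comparison.

```python
import math
from itertools import combinations, permutations

def kendall_distance(pi, sigma):
    """Number of label pairs that the two position vectors order differently."""
    return sum(
        1
        for i, j in combinations(range(len(pi)), 2)
        if (pi[i] - pi[j]) * (sigma[i] - sigma[j]) < 0
    )

def mallows_pmf(center, theta):
    """Return {ranking: probability} over all n! rankings (brute force)."""
    n = len(center)
    weights = {
        sigma: math.exp(-theta * kendall_distance(sigma, center))
        for sigma in permutations(range(1, n + 1))
    }
    phi = sum(weights.values())  # normalization constant
    return {sigma: w / phi for sigma, w in weights.items()}

def phi_closed_form(theta, n):
    """Normalization constant for the Kendall distance (theta > 0)."""
    q = math.exp(-theta)
    return math.prod((1 - q**j) / (1 - q) for j in range(1, n + 1))

pmf = mallows_pmf(center=(1, 2, 3), theta=1.0)
for sigma, p in sorted(pmf.items(), key=lambda item: -item[1]):
    print(sigma, round(p, 3))
```

The center ranking receives the highest probability, and probability decays with the distance from it.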

Inference (complete rankings)

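Rankings σ = {σ_1, …, σ_k} observed locally. Assuming independence, the likelihood is

P(σ | θ, π) = ∏_{i=1..k} P(σ_i | θ, π) = exp(−θ Σ_i D(σ_i, π)) / φ(θ)^k

where D is the metric and φ(θ) the normalization constant of the Mallows model.

The ML estimate of π is the ranking that minimizes Σ_i D(σ_i, π). Given this center, the ML estimate of θ is obtained by matching the mean observed distance to its expectation under the model, which is monotone in θ.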
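A minimal sketch of ML estimation from complete rankings, under the assumption that the Mallows model uses the Kendall distance. The center is found by brute force (small label sets only), and the spread θ by bisection on the expected distance, which decreases monotonically in θ. Function names, the search interval, and the tolerance are illustrative choices, not taken from the slides.

```python
import math
from itertools import combinations, permutations

def kendall_distance(pi, sigma):
    """Number of label pairs that the two position vectors order differently."""
    return sum(
        1
        for i, j in combinations(range(len(pi)), 2)
        if (pi[i] - pi[j]) * (sigma[i] - sigma[j]) < 0
    )

def expected_distance(theta, n):
    """Mean Kendall distance to the center under the model; decreasing in theta."""
    q = math.exp(-theta)
    return n * q / (1 - q) - sum(j * q**j / (1 - q**j) for j in range(1, n + 1))

def fit_mallows(rankings, lo=1e-4, hi=50.0, tol=1e-9):
    """Return (center, theta) maximizing the likelihood of complete rankings."""
    n = len(rankings[0])
    # ML center: the ranking with minimal total distance to the sample
    center = min(
        permutations(range(1, n + 1)),
        key=lambda c: sum(kendall_distance(r, c) for r in rankings),
    )
    mean_d = sum(kendall_distance(r, center) for r in rankings) / len(rankings)
    if mean_d == 0:
        return center, math.inf  # all rankings agree: degenerate, fully peaked
    # Bisection: find theta with expected_distance(theta) == mean_d
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_distance(mid, n) > mean_d:
            lo = mid  # model still too spread out, increase theta
        else:
            hi = mid
    return center, (lo + hi) / 2

sample = [(1, 2, 3), (1, 2, 3), (1, 2, 3), (2, 1, 3)]
center, theta = fit_mallows(sample)
print(center, theta)
```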

Inference (incomplete rankings)

Probability of an incomplete ranking σ:

P(E(σ)) = Σ_{σ* ∈ E(σ)} P(σ*)

where E(σ) denotes the set of consistent extensions of σ.

Example for label set {a,b,c}:

Observation σ    Extensions E(σ)
a ≻ b            a ≻ b ≻ c
                 a ≻ c ≻ b
                 c ≻ a ≻ b
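The set of consistent extensions can be enumerated by brute force for small label sets. This is a minimal sketch that assumes an incomplete observation is an ordering over a subset of the labels; the function name is illustrative.

```python
from itertools import permutations

def consistent_extensions(observation, labels):
    """All complete orderings of labels that preserve the observed order."""
    observed = set(observation)
    result = []
    for candidate in permutations(labels):
        # Keep only the observed labels and compare their relative order
        restricted = [label for label in candidate if label in observed]
        if restricted == list(observation):
            result.append(candidate)
    return result

print(consistent_extensions(("a", "b"), ("a", "b", "c")))
# [('a', 'b', 'c'), ('a', 'c', 'b'), ('c', 'a', 'b')]
```

The probability of the incomplete observation is then the sum of the model probabilities of these extensions.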

Inference (incomplete rankings) cont.

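The corresponding likelihood:

P(σ | θ, π) = ∏_{i=1..k} P(E(σ_i) | θ, π) = (1 / φ(θ)^k) ∏_{i=1..k} Σ_{σ* ∈ E(σ_i)} exp(−θ D(σ*, π))

with D the metric and φ(θ) the normalization constant of the Mallows model.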

Exact MLE becomes infeasible when n is large. Approximation is needed.


Inference (incomplete rankings) cont.

Approximation via a variant of EM, viewing the non‐observed labels as hidden variables.

Replace the E‐step of the EM algorithm with a maximization step (widely used in learning HMMs, K‐means clustering, etc.):

1. Start with an initial center ranking (via generalized Borda count)
2. Replace an incomplete observation with its most probable extension (first M‐step, can be done efficiently)
3. Obtain MLE as in the complete ranking case (second M‐step)
4. Replace the initial center ranking with the current estimate
5. Repeat until convergence
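The five steps can be sketched in Python as follows. This is a brute-force illustration under several assumptions that are not in the slides: the metric is the Kendall distance, incomplete observations are orderings over a subset of the labels, and the initial center comes from a simple Borda-style score (mean normalized position), which stands in for the generalized Borda count. The efficient computation of the most probable extension mentioned in step 2 is replaced here by plain enumeration.

```python
from itertools import combinations, permutations

def kendall_distance(a, b):
    """Number of label pairs that orderings a and b rank in opposite order."""
    pos = {label: i for i, label in enumerate(b)}
    return sum(1 for x, y in combinations(a, 2) if pos[x] > pos[y])

def extensions(observation, labels):
    """All complete orderings of labels that preserve the observed order."""
    observed = set(observation)
    return [p for p in permutations(labels)
            if tuple(l for l in p if l in observed) == tuple(observation)]

def initial_center(observations, labels):
    """Assumed Borda-style start: sort labels by mean normalized position."""
    scores = {label: [] for label in labels}
    for obs in observations:
        for i, label in enumerate(obs):
            scores[label].append(i / (len(obs) - 1) if len(obs) > 1 else 0.5)
    mean = {l: (sum(s) / len(s) if s else 0.5) for l, s in scores.items()}
    return tuple(sorted(labels, key=lambda l: mean[l]))

def fit_center(observations, labels, max_iter=50):
    """Alternate between completing observations and re-estimating the center."""
    center = initial_center(observations, labels)  # step 1
    completed = []
    for _ in range(max_iter):
        # Step 2: for theta > 0 the most probable extension is the
        # consistent extension closest to the current center.
        completed = [
            min(extensions(obs, labels), key=lambda e: kendall_distance(e, center))
            for obs in observations
        ]
        # Step 3: ML center for the completed rankings (brute force).
        new_center = min(
            permutations(labels),
            key=lambda c: sum(kendall_distance(e, c) for e in completed),
        )
        if new_center == center:  # step 5: converged
            break
        center = new_center  # step 4
    return center, completed

observations = [("a", "b"), ("a", "b", "c"), ("b", "c"), ("a", "c")]
center, completed = fit_center(observations, ("a", "b", "c"))
print(center)
```

Once the observations are completed, the spread parameter can be estimated exactly as in the complete ranking case.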


Inference

Not only the estimated ranking is of interest …

… but also the spread parameter θ, which is a measure of precision and, therefore, reflects the confidence/reliability of the prediction (just like the variance of an estimated mean).

The bigger θ, the more peaked the distribution around the center ranking.

[Figure: two distributions around the center ranking, one with high variance (small θ) and one with low variance (large θ)]
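The effect of θ on peakedness can be checked numerically. This minimal sketch assumes the Kendall distance, for which the probability of the center ranking is 1/φ(θ) with a closed-form φ; the function name is illustrative.

```python
import math

def center_probability(theta, n):
    """Probability of the center ranking under a Mallows model (theta > 0)."""
    q = math.exp(-theta)
    phi = math.prod((1 - q**j) / (1 - q) for j in range(1, n + 1))
    return 1.0 / phi

# Larger theta concentrates more probability mass on the center ranking
for theta in (0.1, 1.0, 3.0):
    print(theta, round(center_probability(theta, 4), 3))
```

As θ approaches 0 the distribution approaches the uniform one over all n! rankings, and as θ grows the center ranking takes almost all of the mass.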

Label Ranking Trees

Major modifications:

split criterion
Split ranking set T into T+ and T−, maximizing goodness‐of‐fit

stopping criterion for partition
1. tree is pure: any two labels in two different rankings have the same order
2. number of labels in a node is too small: prevents excessive fragmentation
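The purity condition in the first stopping criterion can be sketched as a simple check. It assumes each ranking in a node is an ordering (possibly incomplete) from most to least preferred; the function name is illustrative.

```python
from itertools import combinations

def is_pure(rankings):
    """True if no pair of labels is ordered differently in two rankings."""
    seen = set()  # ordered pairs (higher, lower) observed so far
    for ranking in rankings:
        # combinations preserves order, so `higher` precedes `lower`
        for higher, lower in combinations(ranking, 2):
            if (lower, higher) in seen:
                return False
            seen.add((higher, lower))
    return True

print(is_pure([("BMW", "Mini"), ("BMW", "Mini", "Toyota"), ("Mini", "Toyota")]))
print(is_pure([("BMW", "Mini"), ("Mini", "BMW", "Toyota")]))
```

The first node is pure because all rankings agree on every shared pair of labels; the second is not, since BMW and Mini appear in both orders.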

Label Ranking Trees

Labels: BMW, Mini, Toyota

Experimental Results

IBLR: instance‐based label ranking
LRT: label ranking trees
CC: constraint classification

Performance in terms of Kendall’s tau

Accuracy (Kendall’s tau)

Main observation: Local methods are more flexible and can exploit more preference information compared with the model‐based approach.

Typical “learning curves”:

[Figure: ranking performance (y-axis) versus probability of missing label (x-axis, 0 to 0.7); the amount of preference information decreases from left to right]

Take‐away Messages

An instance‐based method for label ranking using a probabilistic model.
Suitable for complete and incomplete rankings.
Comes with a natural measure of the reliability of a prediction.
Makes other types of learners possible: label ranking trees.

o More efficient inference for the incomplete case.
o Dealing with variants of the label ranking problem, such as calibrated label ranking and multi‐label classification.

Google “kebi germany” for more info.
