
Slide 1: Low Bias Bagged Support Vector Machines

Giorgio Valentini
Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Italy
valentini@dsi.unimi.it

Thomas G. Dietterich
Department of Computer Science, Oregon State University, Corvallis, Oregon 97331 USA
http://www.cs.orst.edu/~tgd

Slide 2: Two Questions

- Can bagging help SVMs?
- If so, how should SVMs be tuned to give the best bagged performance?

Slide 3: The Answers

- Can bagging help SVMs? Yes.
- If so, how should SVMs be tuned to give the best bagged performance? Tune each SVM to minimize its bias.

Slide 4: SVMs

- Soft margin classifier: maximizes the margin (thereby controlling the VC dimension) subject to soft separation of the training data.
- The dot product can be generalized using kernels K(x_j, x_i).
- C and the kernel parameter are set using an internal validation set.
- Excellent control of the bias/variance tradeoff: is there any room for improvement?

  minimize:   ||w||² + C Σ_i ξ_i
  subject to: y_i (w · x_i + b) + ξ_i ≥ 1,  ξ_i ≥ 0
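For concreteness, here is a minimal sketch of the usual tuning step: choosing C and the kernel parameter on held-out data. It is an illustration only, not the paper's setup; the synthetic data, the grid, and the use of scikit-learn's gamma (which stands in for the RBF width parameter) are assumptions.

```python
# Minimal sketch: tune C and the RBF kernel parameter by internal validation.
# The data, the grid, and 5-fold cross-validation are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=20, random_state=0)
grid = {"C": [0.1, 1, 10, 100],            # soft-margin penalty
        "gamma": [0.001, 0.01, 0.1]}       # RBF width parameter (scikit-learn's gamma)
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5).fit(X, y)
print(search.best_params_)                 # the (C, gamma) pair with the best validation accuracy
```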

Slide 5: Bias/Variance Error Decomposition for Squared Loss

For regression problems, the loss is (ŷ − y)².

  E_S[(ŷ − y)²] = (E_S[ŷ] − f(x))² + E_S[(ŷ − E_S[ŷ])²] + E[(y − f(x))²]
  (expected error = bias² + variance + noise)

- Bias: systematic error at data point x, averaged over all training sets S of size N.
- Variance: variation of ŷ around its average E_S[ŷ].
- Noise: errors in the observed labels of x.
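To make the identity concrete, the following simulation sketch (not from the slides) estimates the three terms at a single test point and checks that their sum matches the measured squared error. The deliberately simple linear learner, the test point x0, the uniform input range, and reading N(0, 0.2) as a standard deviation are all illustrative assumptions.

```python
# Numerical check of expected error = bias^2 + variance + noise at one test point.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x + 2 * np.sin(1.5 * x)            # the example target function from the next slides
noise_std = 0.2                                  # assuming N(0, 0.2) denotes a standard deviation
x0 = 5.0                                         # fixed test point

preds, labels = [], []
for _ in range(5000):                            # many training sets S of size N = 20
    x = rng.uniform(0, 10, 20)
    y = f(x) + rng.normal(0, noise_std, 20)
    w, b = np.polyfit(x, y, deg=1)               # a deliberately simple (hence biased) linear learner
    preds.append(w * x0 + b)                     # prediction y_hat at x0
    labels.append(f(x0) + rng.normal(0, noise_std))   # an independently drawn noisy label at x0

preds, labels = np.array(preds), np.array(labels)
bias2 = (preds.mean() - f(x0)) ** 2
variance = preds.var()
noise = noise_std ** 2
error = np.mean((preds - labels) ** 2)
print(round(error, 3), round(bias2 + variance + noise, 3))   # the two numbers nearly agree
```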

Slide 6: Example: 20 points

  y = x + 2 sin(1.5x) + N(0, 0.2)

Slide 7: Example: 50 fits (20 examples each)

[Figure: 50 fitted curves, one per 20-example training set.]

Slide 8: Bias

[Figure illustrating the bias of the fits.]

Slide 9: Variance

[Figure illustrating the variance of the fits.]

Slide 10: Noise

[Figure illustrating the noise in the observed labels.]

Slide 11: Variance Reduction and Bagging

- Bagging attempts to simulate a large number of training sets and computes the average prediction y_m over those training sets.
- It then predicts y_m.
- If the simulation is good enough, this eliminates all of the variance.
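A minimal sketch of this idea in a regression setting, where averaging is literal. The regression-tree base learner, the sample size, and the 50 bootstrap replicates are placeholders, not the paper's configuration.

```python
# Bagging sketch: bootstrap replicates stand in for fresh training sets, and
# the averaged prediction y_m is returned. Base learner and data are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(20, 1))
y = X[:, 0] + 2 * np.sin(1.5 * X[:, 0]) + rng.normal(0, 0.2, 20)   # the slides' example function

models = []
for _ in range(50):                               # 50 simulated "training sets"
    idx = rng.integers(0, len(y), len(y))         # bootstrap resample of the one real training set
    models.append(DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))

X_test = np.linspace(0, 10, 5).reshape(-1, 1)
y_m = np.mean([m.predict(X_test) for m in models], axis=0)   # averaged prediction y_m
print(y_m)
```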

Slide 12: Bias and Variance for 0/1 Loss (Domingos, 2000)

At each test point x, we have 100 estimates ŷ_1, …, ŷ_100 ∈ {−1, +1}.

- Main prediction: y_m = majority vote
- Bias(x) = 0 if y_m is correct, and 1 otherwise
- Variance(x) = probability that ŷ ≠ y_m
- Unbiased variance V_U(x): variance when Bias = 0
- Biased variance V_B(x): variance when Bias = 1
- Error rate(x) = Bias(x) + V_U(x) − V_B(x)
- Noise is assumed to be zero.
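A toy sketch of this decomposition at one test point, with hypothetical predictions (the 65/35 vote split and the random seed are made up); it checks that the error rate equals Bias + V_U − V_B.

```python
# Domingos-style 0/1 decomposition at a single test point (illustrative values).
import numpy as np

rng = np.random.default_rng(1)
y_true = +1                                            # correct label at x (noise assumed zero)
y_hats = rng.choice([-1, +1], size=100, p=[0.35, 0.65])   # hypothetical 100 predictions at x

y_m = +1 if (y_hats == +1).sum() >= 50 else -1         # main prediction: majority vote
bias = 0 if y_m == y_true else 1                       # 0/1 bias at x
variance = np.mean(y_hats != y_m)                      # P(y_hat != y_m)
v_u = variance if bias == 0 else 0.0                   # unbiased variance
v_b = variance if bias == 1 else 0.0                   # biased variance

error_rate = np.mean(y_hats != y_true)                 # fraction of wrong individual predictions
print(error_rate, bias + v_u - v_b)                    # identical by construction
```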

Slide 13: Good Variance and Bad Variance

  Error rate(x) = Bias(x) + V_U(x) − V_B(x)

- V_B(x) is "good" variance, but only when the bias is high.
- V_U(x) is "bad" variance.
- Bagging will reduce both types of variance; this gives good results if Bias(x) is small.
- Goal: tune classifiers to have small bias and rely on bagging to reduce the variance.

Slide 14: Lobag

Given:
- Training examples {(x_i, y_i)}, i = 1, …, N
- A learning algorithm with tuning parameters
- Parameter settings to try {α_1, α_2, …}

Do:
- Apply internal bagging to compute out-of-bag estimates of the bias of each parameter setting.
- Let α* be the setting that gives minimum bias.
- Perform bagging using α* (a sketch follows below).
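Here is a rough sketch of that procedure as I read it from the slide; it is not the authors' code. It assumes NumPy arrays with binary labels in {−1, +1}, an RBF SVM whose setting is a (C, gamma) pair, and illustrative ensemble sizes; oob_bias, lobag, n_boot, and n_bag are made-up names.

```python
# Lobag sketch: estimate the 0/1 bias of each parameter setting from
# out-of-bag majority votes, pick the lowest-bias setting, then bag with it.
import numpy as np
from sklearn.svm import SVC

def oob_bias(X, y, make_svm, n_boot=30, rng=None):
    """Out-of-bag estimate of the Domingos 0/1 bias (labels assumed in {-1, +1})."""
    rng = rng or np.random.default_rng(0)
    n = len(y)
    votes = np.zeros((n, 2))                       # votes[i] = (# votes for -1, # votes for +1)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                # bootstrap replicate
        oob = np.setdiff1d(np.arange(n), idx)      # its out-of-bag examples
        pred = make_svm().fit(X[idx], y[idx]).predict(X[oob])
        votes[oob[pred == -1], 0] += 1
        votes[oob[pred == +1], 1] += 1
    main_pred = np.where(votes[:, 1] >= votes[:, 0], +1, -1)   # majority-vote main prediction
    return float(np.mean(main_pred != y))          # fraction of points where it is wrong

def lobag(X, y, settings, n_bag=100):
    """Pick the (C, gamma) setting with minimum estimated bias, then bag SVMs with it."""
    C_star, g_star = min(settings, key=lambda s: oob_bias(X, y, lambda: SVC(C=s[0], gamma=s[1])))
    rng = np.random.default_rng(0)
    bag = []
    for _ in range(n_bag):                         # final ensemble: bag of SVMs with the chosen setting
        idx = rng.integers(0, len(y), len(y))
        bag.append(SVC(C=C_star, gamma=g_star).fit(X[idx], y[idx]))
    return (C_star, g_star), bag
```

A call might look like lobag(X, y, settings=[(1, 0.1), (10, 0.01), (100, 0.001)]) with a hypothetical grid; predictions from the returned bag would be combined by majority vote.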

Slide 15: Example: Letter2, RBF kernel, σ = 100

[Figure: error and bias curves, with the minimum-error and minimum-bias settings marked.]

Slide 16: Experimental Study

- Seven data sets: P2, waveform, grey-landsat, spam, musk, letter2 (letter recognition 'B' vs 'R'), and letter2+noise (20% added noise)
- Three kernels: dot product, RBF (σ = Gaussian width), and polynomial (parameter = degree)
- Training sets of 100 examples
- The final classifier is a bag of 100 SVMs trained with the chosen C and kernel parameter (a usage sketch follows below)
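For reference, one way to build such a final classifier with scikit-learn; this is an illustration rather than the authors' setup, and the C and gamma values shown are hypothetical stand-ins for the tuned parameters.

```python
# Illustrative final classifier: a bag of 100 RBF SVMs (hypothetical C, gamma).
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

bagged_svm = BaggingClassifier(
    SVC(kernel="rbf", C=10.0, gamma=0.01),   # base SVM with stand-in tuned parameters
    n_estimators=100,                        # bag of 100 SVMs
    bootstrap=True,                          # each SVM is trained on a bootstrap replicate
    random_state=0,
)
# bagged_svm.fit(X_train, y_train); bagged_svm.predict(X_test)
```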

Slide 17: Results: Dot Product Kernel

[Results figure.]

Slide 18: Results (2): Gaussian Kernel

[Results figure.]

Slide 19: Results (3): Polynomial Kernel

[Results figure.]

Slide 20: McNemar's Tests: Bagging versus Single SVM

[Figure: stacked bars of wins, ties, and losses (0–100%) for the linear, polynomial, and RBF kernels.]

Slide 21: McNemar's Test: Lobag versus Single SVM

[Figure: stacked bars of wins, ties, and losses (0–100%) for the linear, polynomial, and RBF kernels.]

Slide 22: McNemar's Test: Lobag versus Bagging

[Figure: stacked bars of wins, ties, and losses (0–100%) for the linear, polynomial, and RBF kernels.]

Slide 23: Results: McNemar's Test (wins – ties – losses)

Kernel        Lobag vs Bagging   Lobag vs Single   Bagging vs Single
Linear        3 – 26 – 1         23 – 7 – 0        21 – 9 – 0
Polynomial    12 – 23 – 0        17 – 17 – 1       9 – 24 – 2
Gaussian      17 – 17 – 1        18 – 15 – 2       9 – 22 – 4
Total         32 – 66 – 2        58 – 39 – 3       39 – 55 – 6
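For readers unfamiliar with the test: McNemar's test compares two classifiers using only the examples they classify differently. Below is a minimal sketch with made-up disagreement counts; the helper name and the counts are illustrative, not data from the paper.

```python
# McNemar's test from disagreement counts (continuity-corrected chi-square form).
from scipy.stats import chi2

def mcnemar(n01, n10):
    """n01: examples only classifier A gets wrong; n10: examples only classifier B gets wrong."""
    stat = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
    return stat, chi2.sf(stat, 1)                 # p-value with 1 degree of freedom

print(mcnemar(n01=30, n10=12))                    # hypothetical counts -> (statistic, p-value)
```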

Slide 24: Discussion

For small training sets:
- Bagging can improve SVM error rates, especially for linear kernels.
- Lobag is at least as good as bagging, and often better.

Consistent with previous experience:
- Bagging works better with unpruned trees.
- Bagging works better with neural networks that are trained longer or with less weight decay.

Slide 25: Conclusions

- Lobag is recommended for SVM problems with high variance (small training sets, high noise, many features).
- Added cost: SVMs already require internal validation to set C and the kernel parameter; Lobag additionally requires internal bagging to estimate the bias of each setting of C and the kernel parameter.

Future research:
- Smart search for low-bias settings of C and the kernel parameter.
- Experiments with larger training sets.