Competitions in machine learning: the fun, the art, and the science. Isabelle Guyon, Clopinet, Berkeley, California. http://clopinet.com/challenges isabelle@clopinet.com

Page 1: Competitions in machine learning: the fun, the art, and the science Isabelle Guyon Clopinet, Berkeley, California  isabelle@clopinet.com.

Competitions in machine learning: the fun, the art, and the science

Isabelle Guyon, Clopinet, Berkeley, California

http://clopinet.com/challenges isabelle@clopinet.com

My Itinerary

My Kids

My Company

Recent Projects

Melanoma App

Drug toxicity via flow cytometry

How to keep up with the state of the art?

Organize Challenges!

Gesture and action recognition

Image or video indexing/retrieval

Recognition of sign languages

Handwriting recognition

Text processing

Ecology

Applications

Free registration, cash prizes, 2 workshops (IJCNN 2011, ICML 2011), proceedings in JMLR W&CP, much fun! http://clopinet.com/ul

Just starting …

First UTL challenge

Large database of American Sign Language developed at Boston University (http://www.bu.edu/asllrp/), including 3,800 signs with 3,000 unique class labels produced by native signers and 15 short narratives for a total of over 11,000 sign tokens altogether.

Gesture Recognition Challenge

STEP 1: Develop a system that can learn new signs from a few examples. First screening of the competitors based on their performance on a validation dataset.

STEP 2: On the site of the live competition, train the system with given “flash cards” to recognize new signs. Second screening of the competitors based on the learning curve of their system.

STEP 3: Perform short sequences of signs (charades) in front of an audience. Win if your system gets the best recognition rate.

Live Competition

DECEMBER 2010

NIPS 2010: Launch of the UTL challenge.

JUNE 2011

Workshop at CVPR 2011 (accepted). Launch of the Gesture Recognition challenge.

JULY 2011

Workshop at ICML 2011 (planned).

AUGUST 2011

Workshop at IJCNN 2011 (accepted). Results of the UTL challenge.

NOVEMBER 2011

Live Gesture Recognition Competition: ICCV 2011 (planned).

When and Where?

Lessons learned!

• Thousands to millions of low-level features: select the most relevant ones to build better, faster, and easier-to-understand learning machines.

[Figure: data matrix X with m examples (rows) and n features (columns), reduced to n’ selected features.]

NIPS 2003 Feature Selection Challenge

[Figure: application domains placed by number of examples versus number of variables/features, on log-scale axes spanning 10 to 10^6. Domains shown: Bioinformatics, Quality control, Machine vision, Customer knowledge, OCR/HWR, Market analysis, Text categorization, System diagnosis.]

Applications of Feature Selection

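[Diagram: three feature selection pipelines.]

Filter: All features → Filter → Feature subset → Predictor

Wrapper: All features → Wrapper → Multiple feature subsets → Predictor

Embedded method: All features → Embedded method → Feature subset → Predictor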

Filters, Wrappers, and Embedded Methods
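
The difference between the first two families can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correlation score used by the filter, the least-squares predictor, the greedy forward search used by the wrapper, and the synthetic data are all choices made for this sketch, not methods named on the slide. Embedded methods are not sketched here.

```python
import numpy as np

def filter_select(X, y, k):
    """Filter: score each feature on its own, without consulting any predictor."""
    scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    top = np.argsort(-scores)[:k]
    return sorted(top.tolist())

def validation_error(X_tr, y_tr, X_va, y_va, subset):
    """Train a least-squares linear classifier on `subset`; return its validation error."""
    w, *_ = np.linalg.lstsq(X_tr[:, subset], y_tr, rcond=None)
    return float(np.mean(np.sign(X_va[:, subset] @ w) != y_va))

def wrapper_select(X_tr, y_tr, X_va, y_va, k):
    """Wrapper: greedy forward search, scoring each candidate subset with the predictor."""
    chosen = []
    while len(chosen) < k:
        rest = [j for j in range(X_tr.shape[1]) if j not in chosen]
        best = min(rest, key=lambda j: validation_error(X_tr, y_tr, X_va, y_va, chosen + [j]))
        chosen.append(best)
    return sorted(chosen)

# Hypothetical synthetic data: 200 examples, 10 features, only features 0 and 1 informative.
rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=200)
X = rng.normal(size=(200, 10))
X[:, :2] += 2 * y[:, None]

X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]
print("filter :", filter_select(X_tr, y_tr, 2))
print("wrapper:", wrapper_select(X_tr, y_tr, X_va, y_va, 2))
```

The filter never trains a predictor, so it is cheap. The wrapper retrains the predictor for every candidate subset, which costs more but lets the predictor's own performance guide the search.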

Split data into 3 sets: training, validation, and test set.

[Figure: data matrix of M samples by N variables/features, with the samples split into three blocks of sizes m1 (training), m2 (validation), and m3 (test).]

1) For each feature subset, train the predictor on training data.

2) Select the feature subset that performs best on validation data.

– Repeat and average if you want to reduce variance (cross-validation).

3) Test on test data.

Bilevel Optimization
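
A minimal sketch of the three steps, assuming a nearest-centroid classifier as the predictor and nested subsets from a simple ranking as the candidates. Both are illustrative choices for this sketch, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np

def fit_centroid(X, y):
    """Nearest-centroid linear classifier for labels in {-1, +1}."""
    mu_pos, mu_neg = X[y == 1].mean(axis=0), X[y == -1].mean(axis=0)
    w = mu_pos - mu_neg
    b = -0.5 * (mu_pos + mu_neg) @ w
    return w, b

def error_rate(model, X, y):
    w, b = model
    return float(np.mean(np.sign(X @ w + b) != y))

def select_subset(X, y, m1, m2, subsets):
    """Train on the first m1 rows, select on the next m2 rows, test on the rest."""
    tr, va, te = slice(0, m1), slice(m1, m1 + m2), slice(m1 + m2, None)
    best, best_err, best_model = None, np.inf, None
    for subset in subsets:
        model = fit_centroid(X[tr][:, subset], y[tr])        # 1) train on training data
        err = error_rate(model, X[va][:, subset], y[va])     # 2) score on validation data
        if err < best_err:
            best, best_err, best_model = subset, err, model
    test_err = error_rate(best_model, X[te][:, best], y[te])  # 3) test once, at the end
    return best, best_err, test_err

# Hypothetical synthetic data: M = 300 samples, N = 20 features, first 3 informative.
rng = np.random.default_rng(0)
M, N = 300, 20
y = rng.choice([-1, 1], size=M)
X = rng.normal(size=(M, N))
X[:, :3] += y[:, None]

# Candidate subsets: nested subsets from a ranking computed on the training rows only.
m1, m2 = 100, 100
ranking = np.argsort(-np.abs((X[:m1] * y[:m1, None]).mean(axis=0)))
subsets = [ranking[:n].tolist() for n in range(1, N + 1)]

best, val_err, test_err = select_subset(X, y, m1, m2, subsets)
print("subset size:", len(best), "validation error:", val_err, "test error:", test_err)
```

The test rows are touched only once, after the subset has been chosen, so the reported test error is not biased by the selection.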

Method | Number of subsets tried | Complexity C
Exhaustive search wrapper | 2^N | N
Nested subsets, feature ranking | N(N+1)/2 or N | log N

With high probability:

$\mathrm{Generalization\_error} \le \mathrm{Validation\_error} + \epsilon(C/m_2)$

m2: number of validation examples, N: total number of features, n: feature subset size.

[Figure: error as a function of the feature subset size n.]

Try to keep C of the order of m2.

Complexity of Feature Selection
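
The orders of magnitude in the table can be checked numerically. The sketch below assumes that C is taken as the base-2 logarithm of the number of subsets tried, which reproduces the orders N and log N given above. That reading, and the function names, are assumptions of this sketch.

```python
import math

def subsets_tried(n_features: int, method: str) -> int:
    """Number of feature subsets evaluated by each search strategy."""
    if method == "exhaustive":
        return 2 ** n_features
    if method == "nested":        # greedy forward or backward selection
        return n_features * (n_features + 1) // 2
    if method == "ranking":       # one subset per cut-off in a feature ranking
        return n_features
    raise ValueError(f"unknown method: {method}")

def complexity(n_features: int, method: str) -> float:
    """Assumed complexity C: log2 of the number of subsets tried."""
    return math.log2(subsets_tried(n_features, method))

# Hypothetical setting: N = 1000 features, m2 = 100 validation examples.
N, m2 = 1000, 100
for method in ("exhaustive", "nested", "ranking"):
    C = complexity(N, method)
    print(f"{method:10s} C = {C:8.1f}  C/m2 = {C / m2:6.2f}")
```

Under these assumptions, exhaustive search gives C far above m2, while nested subsets and feature ranking keep C well below it.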

[Figure: example causal graph over the variables Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, and Fatigue.]

WCCI 2008: Causation and Prediction Challenge

Simple univariate predictive model with a binary target and binary features. All relevant features correlate perfectly with the target; all irrelevant features are randomly drawn. With 98% confidence, abs(feat_weight) < w and $\sum_i w_i x_i < v$.

ng: number of “good” (relevant) features

nb: number of “bad” (irrelevant) features

m: number of training examples.

Insensitivity to Irrelevant Features
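
A small simulation of this setting may help. It assumes a ±1 encoding and a weight for each feature equal to its empirical correlation with the target on the training set; these choices, the function name, and the sample sizes are assumptions of this sketch, and the thresholds w and v from the slide are not reproduced. Under these assumptions each irrelevant weight is an average of m random signs, so its standard deviation is 1/sqrt(m).

```python
import numpy as np

def simulate(n_good, n_bad, m, n_test=2000, seed=0):
    """Return the learned weights and the test accuracy of the univariate model."""
    rng = np.random.default_rng(seed)

    def sample(size):
        y = rng.choice([-1, 1], size=size)
        good = np.repeat(y[:, None], n_good, axis=1)      # perfectly correlated with y
        bad = rng.choice([-1, 1], size=(size, n_bad))     # drawn at random
        return np.hstack([good, bad]), y

    X_tr, y_tr = sample(m)
    X_te, y_te = sample(n_test)
    w = (X_tr * y_tr[:, None]).mean(axis=0)   # one weight per feature, set independently
    scores = X_te @ w                         # sum_i w_i x_i for each test example
    accuracy = float(np.mean(np.sign(scores) == y_te))
    return w, accuracy

# Hypothetical sizes: ng = 10 good features, nb = 1000 bad features, m = 100 examples.
ng, nb, m = 10, 1000, 100
w, acc = simulate(ng, nb, m)
print("largest |weight| among bad features:", np.max(np.abs(w[ng:])))
print("1/sqrt(m):", 1 / np.sqrt(m))
print("test accuracy:", acc)
```

In this sketch the good features each receive weight 1, while the bad features receive small weights whose combined contribution stays below the signal from the good features.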

Active Learning Challenge AISTATS & WCCI 2010

Web platform: Server made available by Prof. Joachim Buhmann, ETH Zurich, Switzerland. Computer admin.: Thomas Fuchs, ETH Zurich. Webmaster: Olivier Guyon, MisterP.net, France.

Protocol review and advising:
• David W. Aha, Naval Research Laboratory, USA.
• Abe Schneider, Knexus Research, USA.
• Graham Taylor, NYU, New York, USA.
• Andrew Ng, Stanford Univ., Palo Alto, California, USA.
• Vassilis Athitsos, University of Texas at Arlington, Texas, USA.
• Ivan Laptev, INRIA, France.
• Jitendra Malik, UC Berkeley, California, USA.
• Christian Vogler, ILSP Athens, Greece.
• Sudeep Sarkar, University of South Florida, USA.
• Philippe Dreuw, RWTH Aachen University, Germany.
• Richard Bowden, Univ. Surrey, UK.
• Greg Mori, Simon Fraser University, Canada.

Data collection and preparation:
• Vassilis Athitsos, University of Texas at Arlington, Texas, USA.
• Isabelle Guyon, Clopinet, California, USA.
• Graham Taylor, NYU, New York, USA.
• Ivan Laptev, INRIA, France.
• Jitendra Malik, UC Berkeley, California, USA.

Baseline methods and beta testing:
The following researchers experienced in the domain will be providing baseline results:
• Vassilis Athitsos, University of Texas at Arlington, Texas, USA.
• Graham Taylor, NYU, New York, USA.
• Andrew Ng, Stanford Univ., Palo Alto, California, USA.
• Yann LeCun, NYU, New York, USA.
• Ivan Laptev, INRIA, France.

The following researchers, who were top-ranking participants in past challenges but are not experienced in the domain, will also give it a try:
• Alexander Borisov (Intel, USA)
• Hugo-Jair Escalante (INAOE, México)
• Amir Saffari (Graz Univ., Austria)
• Alexander Statnikov (NYU, USA)

Credits

1) Feature Extraction, Foundations and Applications. I. Guyon, S. Gunn, et al. Springer, 2006.

http://clopinet.com/fextract-book

2) Challenges in Machine Learning. Collection published by Microtome. Papers on the challenges reprinted from JMLR and JMLR W&CP.

Resources

Join the Hall of Fame!