Applied Machine Learning

44
Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1 22C3, Berlin, 27.12.2005 Timon Schroeter Konrad Rieck Soeren Sonnenburg Intelligent Data Analysis Group Fraunhofer FIRST http://ida.first.fhg.de/ Applied Machine Learning

description

 

Transcript of Applied Machine Learning

Page 1: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 122C3, Berlin, 27.12.2005

Timon Schroeter

Konrad Rieck

Soeren Sonnenburg

Intelligent Data Analysis Group

Fraunhofer FIRST

http://ida.first.fhg.de/

Applied Machine Learning

Page 2: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 222C3, Berlin, 27.12.2005

Roadmap

• Some Background

• SVMs & Kernels

• Applications

Rationale: Let computers learn, to allow humans to

to automate processes

to understand highly complex data

Page 3: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 322C3, Berlin, 27.12.2005

Example: Spam Classification

From: [email protected]: CongratulationsDate: 16. December 2004 02:12:54 MEZ

LOTTERY COORDINATOR,INTERNATIONAL PROMOTIONS/PRIZE AWARD DEPARTMENT.SMARTBALL LOTTERY, UK.

DEAR WINNER,

WINNER OF HIGH STAKES DRAWS

Congratulations to you as we bring to your notice, theresults of the the end of year, HIGH STAKES DRAWS ofSMARTBALL LOTTERY UNITED KINGDOM. We are happy to inform youthat you have emerged a winner under the HIGH STAKES DRAWSSECOND CATEGORY,which is part of our promotional draws. Thedraws were held on15th DECEMBER 2004 and results are beingofficially announced today. Participants were selected

through a computer ballot system drawn from 30,000names/email addresses of individuals and companies fromAfrica, America, Asia, Australia,Europe, Middle East, andOceania as part of our International Promotions Program.

From: [email protected]: ML Positions in Santa CruzDate: 4. December 2004 06:00:37 MEZ

We have a Machine Learning positionat Computer Science Department ofthe University of California at Santa Cruz(at the assistant, associate or full professor level).

Current faculty members in related areas:Machine Learning: DAVID HELMBOLD and MANFRED WARMUTHArtificial Intelligence: BOB LEVINSONDAVID HAUSSLER was one of the main ML researchers in ourdepartment. He now has launched the new Biomolecular Engineeringdepartment at Santa Cruz

There is considerable synergy for Machine Learning at SantaCruz:-New department of Applied Math and Statistics with an emphasison Bayesian Methods http://www.ams.ucsc.edu/-- New department of Biomolecular Engineeringhttp://www.cbse.ucsc.edu/

Goal: Classify emails into spam / no spam

How? Learn from previously labeled emails!

Training: analyze previous emailsApplication: classify new emails

Page 4: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 422C3, Berlin, 27.12.2005

Problem Formulation

Natural+1

Natural+1

Plastic-1

Plastic-1 ?

The “World”:• Data: Pairs (x, y)

• Featurevector x

• Individual features e.g. x R• e.g. Volume, Mass, RGB-Channels

• Lables y { +1, -1}

• Unknown Target Function y = f(x)• Unknown Distribution x ~ p(x)

• Objective: Given new x predict y

...

Page 5: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 522C3, Berlin, 27.12.2005

Premises for Machine Learning

• Supervised Machine Learning

• Observe N training examples with label

• Learn function

• Predict label of unseen example

• Examples generated from statistical process

• Relationship between features and label

• Assumption: unseen examples are generatedfrom same or similar process

Page 6: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 622C3, Berlin, 27.12.2005

Problem Formulation

Page 7: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 722C3, Berlin, 27.12.2005

Problem Formulation

• Want model to generalize

• Need to find a good level of complexity

x

y

complexity

training ( )

test ( )

erro

r

• In practice e.g. model / parameterselection via crossvalidation

Page 8: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 822C3, Berlin, 27.12.2005

Example: Natural vs. Plastic Apples

Page 9: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 922C3, Berlin, 27.12.2005

Example: Natural vs. Plastic Apples

Page 10: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1022C3, Berlin, 27.12.2005

Linear Separation

property 1

pro

pert

y2

Page 11: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1122C3, Berlin, 27.12.2005

Linear Separation

property 1

?

pro

pert

y2

Page 12: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1222C3, Berlin, 27.12.2005

Linear Separation with Margins

property 1

pro

pe

rty

2

property 1

?

large margin => good generalization

{margin

pro

pert

y2

Page 13: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1322C3, Berlin, 27.12.2005

Large Margin Separation

{margin

Idea:• Find hyperplanethat maximizes margin

(with )• Use for prediction

Solution:• Linear combination of examples

• many ’s are zero

• Support Vector Machines

Demo

Page 14: Applied Machine Learning
Page 15: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1522C3, Berlin, 27.12.2005

Example: Polynomial Kernel

Page 16: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1622C3, Berlin, 27.12.2005

Support Vector Machines

• Demo: Gaussian Kernel

• Many other algorithms can use kernels

• Many other application specific kernels

Page 17: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1722C3, Berlin, 27.12.2005

Capabilities of Current Techniques

• Theoretically & algorithmically well understood:

• Classification with few classes

• Regression (real valued)

• Novelty / Anomaly Detection

Bottom Line: Machine Learning

works well for relatively simple

objects with simple properties

• Current Research

• Complex objects

• Many classes

• Complex learning setup (active learning)

• Prediction of complex properties

Page 18: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1822C3, Berlin, 27.12.2005

Capabilities of Current Techniques

• Theoretically & algorithmically well understood:

• Classification with few classes

• Regression (real valued)

• Novelty / Anomaly Detection

Bottom Line: Machine Learning

works well for relatively simple

objects with simple properties

• Current Research

• Complex objects

• Many classes

• Complex learning setup (active learning)

• Prediction of complex properties

Page 19: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 1922C3, Berlin, 27.12.2005

Capabilities of Current Techniques

• Theoretically & algorithmically well understood:

• Classification with few classes

• Regression (real valued)

• Novelty / Anomaly Detection

Bottom Line: Machine Learning

works well for relatively simple

objects with simple properties

• Current Research

• Complex objects

• Many classes

• Complex learning setup (active learning)

• Prediction of complex properties

Page 20: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2022C3, Berlin, 27.12.2005

Capabilities of Current Techniques

• Theoretically & algorithmically well understood:

• Classification with few classes

• Regression (real valued)

• Novelty / Anomaly Detection

Bottom Line: Machine Learning

works well for relatively simple

objects with simple properties

• Current Research

• Complex objects

• Many classes

• Complex learning setup (active learning)

• Prediction of complex properties

Page 21: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2122C3, Berlin, 27.12.2005

Capabilities of Current Techniques

• Theoretically & algorithmically well understood:

• Classification with few classes

• Regression (real valued)

• Novelty / Anomaly Detection

Bottom Line: Machine Learning

works well for relatively simple

objects with simple properties

• Current Research

• Complex objects

• Many classes

• Complex learning setup (active learning)

• Prediction of complex properties

Page 22: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2222C3, Berlin, 27.12.2005

Many Applications

• Handwritten Letter/Digit recognition

• Gene Finding

• Drug Discovery

• Brain-Computer Interfacing

• Intrusion Detection Systems (unsupervised)

• Document Classification (by topic, spam mails)

• Face/Object detection in natural scenes

• Non-Intrusive Load Monitoring of electric appliances

• Company Fraud Detection (Questionaires)

• Fake Interviewer identification (e.g. in social studies)

• Optimized Disk caching strategies

• Speaker recognition (e.g. on tapped phonelines)

• …

Page 23: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2322C3, Berlin, 27.12.2005

Will discuss in more Detail:

• Handwritten Letter/Digitrecognition

• Drug Discovery

• Fun examples

• Gene Finding

• Brain-Computer Interfacing

Want to try this at home?

• Libsvm (C++) http://www.csie.ntu.edu.tw/~cjlin/libsvm/

• Torch (Java, C++) http://torch.ch

• Numarray (Python) http://sourceforge.net/projects/numpy

Page 24: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2422C3, Berlin, 27.12.2005

MNIST Benchmark

SVM with polynomial kernel(considers d-th order correlations of pixels)

Page 25: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2522C3, Berlin, 27.12.2005

MNIST Error Rates

Page 26: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2622C3, Berlin, 27.12.2005

Drug Discovery / PCADMET

• To be inserted later

Page 27: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2722C3, Berlin, 27.12.2005

File Analysis: Sourcecode

Pseudocode for Visualisation

Determine distances between all

(pairs of) files

Find and count all n-Grams in

each file (gives histograms)

Distance meaure for histograms

of n-grams is the Canberra-

distance

Calculate kernel matrix

Calculate eigenvalues and

eigenvectors of kernel matrix

(PCA)

Plot the two PCA components with

largest variance

Page 28: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2822C3, Berlin, 27.12.2005

File Analysis: Binary Code

Pseudocode for Visualisation

Determine distances between all

(pairs of) files

Find and count all n-Grams in

each file (gives histograms)

Distance meaure for histograms

of n-grams is the Canberra-

distance

Calculate kernel matrix

Calculate eigenvalues and

eigenvectors of kernel matrix

(PCA)

Plot the two PCA components with

largest variance

Page 29: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 2922C3, Berlin, 27.12.2005

Fun Examples: Linux vs. OpenBSD

Visuell, 2 Dimensions

2 / 3 correct?

SVM, 2 Dimensions

73 % korrekt

SVM, 50 Dimensions

95 % korrekt

Page 30: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3022C3, Berlin, 27.12.2005

A Bioinformatics Application

Page 31: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3122C3, Berlin, 27.12.2005

Finding Genes on Genomic DNA

Splice Sites: on the boundary• Exons (may code for protein)• Introns (noncoding)

Page 32: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3222C3, Berlin, 27.12.2005

Application: Splice Site Detection

Engineering Support Vector Machine (SVM) Kernels

That Recognize Splice Sites

Page 33: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3322C3, Berlin, 27.12.2005

2-class Splice Site DetectionWindow of 150nt

around known splice sites

Positive examples: fixed window around a true splice siteNegative examples: generated by shifting the window

Design of new Support Vector Kernel

Page 34: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3422C3, Berlin, 27.12.2005

Single Trial Analysis of EEG:towards BCI

Gabriel Curio Benjamin Blankertz Klaus-Robert Müller

Intelligent Data Analysis Group, Fraunhofer-FIRSTBerlin, Germany

Neurophysics GroupDept. of NeurologyKlinikum BenjaminFranklinFreie UniversitätBerlin, Germany

Page 35: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3522C3, Berlin, 27.12.2005

Cerebral Cocktail Party Problem

Page 36: Applied Machine Learning
Page 37: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3722C3, Berlin, 27.12.2005

The Cocktail Party Problem

• input: 3 mixed signals

• algorithm: enforce independence(“independent component analysis”)via temporal de-correlation

• output: 3 separated signals

(Demo: Andreas Ziehe, Fraunhofer FIRST, Berlin)

"Imagine that you are on the edge of a lake and a friend challenges you to play a game. The gameis this: Your friend digs two narrow channels up from the side of the lake […]. Halfway up each one,your friend stretches a handkerchief and fastens it to the sides of the channel. As waves reach theside of the lake they travel up the channels and cause the two handkerchiefs to go into motion. You

are allowed to look only at the handkerchiefs and from their motions to answer a series of

questions: How many boats are there on the lake and where are they? Which is the most powerfulone? Which one is closer? Is the wind blowing?” (Auditory Scene Analysis, A. Bregman )

Page 38: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3822C3, Berlin, 27.12.2005

Minimal Electrode Configuration

• coverage: bilateral primary

sensorimotor cortices

• 27 scalp electrodes

• reference: nose

• bandpass: 0.05 Hz - 200 Hz

• ADC 1 kHz

• downsampling to 100 Hz

• EMG (forearms bilaterally):m. flexor digitorum

• EOG

• event channel:

keystroke timing

(ms precision)

Page 39: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 3922C3, Berlin, 27.12.2005

Single Trial vs. Averaging

-500 -400 -300 -200 -100 0 [ms]

-15

-10

-5

0

5

10

15

-500 -400 -300 -200 -100 0 [ms]

-15

-10

-5

0

5

10

15[ V]

-600 -500 -400 -300 -200 -100 0 [ms]

-15

-10

-5

0

5

10

15

-600 -500 -400 -300 -200 -100 0 [ms]

-15-10

-5

0

5

10

15[ V]

LEFT

hand(ch. C4)

RIGHThand

(ch. C3)

Page 40: Applied Machine Learning
Page 41: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 4122C3, Berlin, 27.12.2005

BCI Demo: BrainPong

Page 42: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 4222C3, Berlin, 27.12.2005

BCI Demo: BrainPong

• Video 1 Player

• Video 2 Player

Page 43: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 4322C3, Berlin, 27.12.2005

Concluding Remarks

• Computational Challenges• Algorithms can work with 100.000’s of examples

(need operations)

• Usually model parameters to be tuned

(cross-validation is computationally expensive)

• Need computer clusters and

Job scheduling systems (pbs, gridengine)

• Often use MATLAB

(to be replaced by python ?!)

• Machine learning is an exciting research area …

• … involving Computer Science, Statistics & Mathematics

• … with…• a large number of present and future applications (in all situations

where data is available, but explicit knowledge is scarce)…

• an elegant underlying theory…

• and an abundance of questions to study.

• Always looking for motivated students, Ph.D. Students, post-docs

Page 44: Applied Machine Learning

Timon Schroeter, Konrad Rieck, Sören Sonnenburg Applied Machine Learning 4422C3, Berlin, 27.12.2005

Thanks for Your Attention!

Speakers at 22c3: Timon Schroeter, Konrad Rieck, Sören Sonnenburg

[timon, rieck, sonne]@first.fhg.de, http://ida.first.fhg.de

Contributors / Coworkers: Klaus-Robert Müller, Jens Kohlmorgen,

Benjamin Blankertz, Alex Zien, Motoaki Kawanabe, Pavel Laskov,

Gilles Blanchard, Bernhard Schoelkopf, Anton Schwaighofer,

Guido Nolte, Florin Popescu, Stefan Harmeling, Julian Laub,Andreas Ziehe, Steven Lemm, Christin Schäfer, Guido Dornhege,

Frank Meinecke, Matthias Krauledat, Patrick Düssel,

Special Thanks: Gunnar Rätsch (speaker at 21c3, slides)