Adaptive pre-processing for streaming data

www.infer.eu Adaptive pre-processing for streaming data Indrė Žliobaitė [email protected] STRC, Bournemouth University 2011 September 22

description

Many supervised learning approaches that adapt to changes in data distribution over time (e.g. concept drift) have been developed. Most of them assume that data arrives already pre-processed or that pre-processing is an integral part of the learning algorithm. In real applications, data that comes from, e.g., sensor readings is typically noisy and contains missing values and redundant features, so a large part of model training has to be devoted to data cleaning and pre-processing. As data evolves over time, not only the learning models but also the pre-processing mechanisms need to adapt. We will discuss under what circumstances it is beneficial to handle adaptivity of pre-processing and adaptivity of the learning model separately.

Transcript of Adaptive pre-processing for streaming data

Page 1: Adaptive pre-processing for streaming data

www.infer.eu

Adaptive pre-processing for streaming data

Indrė Žliobaitė [email protected], Bournemouth University

2011 September 22

Page 3: Adaptive pre-processing for streaming data

INFER project 2010-2014

Computational INtelligence platform For Evolving and Robust predictive systems

● EC project within the Marie Curie Industry-Academia Partnerships & Pathways (IAPP), 1.55 MEUR

● Three partners from UK, Germany, Poland

● Extended secondments for 23 researchers, who move between sectors and countries for industry-academia knowledge sharing

Page 4: Adaptive pre-processing for streaming data

Objectives

● Area 1: computational intelligence
  ● advanced mechanisms for adaptation
  ● multi-component multi-level evolving predictive systems
  ● robustness and complexity management

● Area 2: software engineering
  ● software platform for building robust predictive systems

● Area 3: process industry applications
  ● adaptive and self-monitoring soft sensors for process industry

Page 5: Adaptive pre-processing for streaming data


Soft sensor – a computational model in process industry. Outputs are computed using sensor readings as inputs.

Page 6: Adaptive pre-processing for streaming data

Peculiarities of the problem setting

Typical data streams setting
● mostly classification tasks
● not identically distributed over time

Industrial process setting
● mostly regression* tasks
● not identically and not independently distributed
● data not iid

Page 7: Adaptive pre-processing for streaming data

Peculiarities of the problem setting

Typical data streams setting
● mostly classification tasks
● not identically distributed over time
● assumes that data arrives clean and pre-processed
● typically assumes immediate feedback
● optimizes accuracy and speed

Industrial process setting
● mostly regression* tasks
● not identically and not independently distributed
● emphasis on data preparation and pre-processing, denoising, handling missing values
● feedback is lagging, costly or not available at all
● + emphasis on robustness (reliability, confidence)
● data not iid


Page 9: Adaptive pre-processing for streaming data

Adaptive learning systems

Page 10: Adaptive pre-processing for streaming data

Example: data stream

Chemical production plant
given sensor readings
predict the quality of the output
24/7 plant operation

Process changes

Model does not change

source: Evonik Industries

Page 11: Adaptive pre-processing for streaming data

Adaptive online learning

● Data arrives online, never-ending
● Data distribution is changing over time
● Limited access to historical data
  ● in large data streams – no access
● Predictive models need to have adaptation mechanisms
  ● update or retrain and replace models to match recent data
  ● otherwise accuracy will degrade over time

Page 12: Adaptive pre-processing for streaming data

Adaptive online learning strategies

● REGULARLY EVOLVING, e.g. training windows
● WITH TRIGGERS, e.g. change detectors
● + single model or ensemble of models

[Diagram: timelines of successive models for the two strategies, with a marked "change" point]
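The two strategies can be contrasted in a minimal sketch. Everything here is an illustrative assumption rather than the method from the talk: the toy predictor simply outputs the mean of its training targets, and the "change detector" is a hypothetical rule that fires after a few consecutive large errors.

```python
from collections import deque


class WindowMeanModel:
    """Regularly evolving: always trained on the last `window_size` targets."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)  # old targets drop out automatically

    def predict(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

    def update(self, y):
        self.window.append(y)


class TriggeredMeanModel:
    """With triggers: keeps all data until a (hypothetical) change detector fires."""

    def __init__(self, threshold, patience):
        self.history = []
        self.threshold = threshold  # error size that counts as suspicious
        self.patience = patience    # consecutive suspicious errors needed to trigger
        self.strikes = 0

    def predict(self):
        return sum(self.history) / len(self.history) if self.history else 0.0

    def update(self, y):
        suspicious = bool(self.history) and abs(y - self.predict()) > self.threshold
        self.strikes = self.strikes + 1 if suspicious else 0
        if self.strikes >= self.patience:
            self.history = []  # change detected: forget data from the old concept
            self.strikes = 0
        self.history.append(y)


def final_prediction(model, targets):
    """Feed the targets one by one and return the model's last prediction."""
    for y in targets:
        model.update(y)
    return model.predict()


# Toy stream with an abrupt change from 0.0 to 10.0 halfway through.
stream = [0.0] * 20 + [10.0] * 20
windowed = final_prediction(WindowMeanModel(window_size=5), stream)
triggered = final_prediction(TriggeredMeanModel(threshold=5.0, patience=3), stream)
```

Both toy models end up tracking the new level, while a model that never forgets old data stays stuck between the two concepts.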

Page 13: Adaptive pre-processing for streaming data

Adaptive learning mode

● INCREMENTAL: update model
● RETRAINING: replace model

[Diagram: the old model is either updated in place (incremental) or replaced by a new model (retraining)]

Page 14: Adaptive pre-processing for streaming data

Adaptive learning mode

● INCREMENTAL: update model
  ● INSTANCE: update with every new instance
  ● BATCH: update in batches
● ENSEMBLE: add/remove models
● RETRAINING: replace model
  ● FULL: retrain with new data
  ● PARTIAL: replace part of model
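The difference between the incremental and the retraining mode can be illustrated on the simplest possible "model", a mean estimate. This is only a sketch under that assumption; the class and function names are made up for the example.

```python
class IncrementalMean:
    """INCREMENTAL, instance by instance: update the estimate with each new value."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n  # running-mean update, no data stored


def retrained_mean(recent_batch):
    """RETRAINING, full: discard the old estimate and refit on new data only."""
    return sum(recent_batch) / len(recent_batch)


values = [1.0, 2.0, 3.0, 4.0]
incremental = IncrementalMean()
for v in values:
    incremental.update(v)

updated = incremental.mean              # reflects all four values
replaced = retrained_mean(values[-2:])  # reflects only the two most recent values
```

The incremental estimate changes smoothly and keeps a memory of old data; the retrained one jumps to whatever the recent batch says.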

Page 15: Adaptive pre-processing for streaming data

Current situation

● Many adaptive learning approaches are available
● Majority of the existing approaches assume that
  ● data comes already pre-processed, or
    – data analysts say that data preparation takes 80-90% of modelling time
  ● pre-processing is trained at the beginning and remains fixed, or
    – limited adaptivity of the system
  ● pre-processing is tied: it adapts whenever the predictor adapts

Page 16: Adaptive pre-processing for streaming data

Fixed pre-processing

[Diagram: raw data stream → pre-processed → predictions. Steps over time: train pre-processing, train predictor, adapt predictor, ...]

Tied pre-processing

[Diagram: raw data stream → pre-processed → predictions. Steps over time: train pre-processing and train predictor; then re-train pre-processing together with every adapt predictor step, ...]

Page 17: Adaptive pre-processing for streaming data

Current situation

● Many adaptive learning approaches are available
● Majority of the existing approaches assume that
  ● data comes already pre-processed, or
    – data analysts say that data preparation takes 80-90% of modelling time
  ● pre-processing is trained at the beginning and remains fixed, or
    – limited adaptivity of the system
  ● pre-processing is tied: it adapts whenever the predictor adapts
● It may be beneficial to decouple adaptation of pre-processing from adaptation of predictor

Page 18: Adaptive pre-processing for streaming data

Decoupled adaptivity

[Diagram: raw data stream → pre-processed → predictions. Steps over time: train pre-processing and train predictor; afterwards re-train pre-processing and adapt predictor occur at separate points in time, ...]
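The fixed, tied and decoupled set-ups differ only in when each element is allowed to adapt. A minimal sketch of that bookkeeping follows; the function name and the use of fixed periods (`predictor_every`, `preproc_every`) are assumptions made for illustration, since adaptation could equally be driven by change detectors.

```python
def adaptation_schedule(mode, n_steps, predictor_every, preproc_every=None):
    """Return (step, retrain_preprocessing, adapt_predictor) for each time step."""
    if mode == "decoupled" and preproc_every is None:
        raise ValueError("decoupled mode needs its own pre-processing period")
    events = []
    for t in range(1, n_steps + 1):
        adapt_predictor = t % predictor_every == 0
        if mode == "fixed":
            retrain_preproc = False              # trained once, never touched again
        elif mode == "tied":
            retrain_preproc = adapt_predictor    # always follows the predictor
        elif mode == "decoupled":
            retrain_preproc = t % preproc_every == 0  # own, independent schedule
        else:
            raise ValueError(f"unknown mode: {mode}")
        events.append((t, retrain_preproc, adapt_predictor))
    return events


tied = adaptation_schedule("tied", n_steps=4, predictor_every=2)
decoupled = adaptation_schedule("decoupled", n_steps=3, predictor_every=1, preproc_every=3)
```

In the decoupled case the predictor here adapts at every step while pre-processing is retrained only every third step, which a tied set-up cannot express.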

Page 19: Adaptive pre-processing for streaming data

Adaptive pre-processing?

Page 20: Adaptive pre-processing for streaming data

Online predictive model

[Diagram: MODEL processing a data stream]

1. receive current data
2. output prediction
3. receive feedback
4. update model
5. receive new data
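The five steps form a test-then-train loop. A minimal sketch, assuming the feedback for each instance arrives before the next instance (which the industrial setting often violates) and using a made-up `LastValueModel` as a stand-in predictor:

```python
class LastValueModel:
    """Toy predictor: predicts the most recently observed target."""

    def __init__(self):
        self.last = 0.0

    def predict(self, x):
        return self.last

    def update(self, x, y):
        self.last = y


def run_online(stream, model):
    """Process (x, y) pairs one at a time and return the absolute errors."""
    errors = []
    for x, y in stream:                 # 1. receive current data
        y_hat = model.predict(x)        # 2. output prediction
        errors.append(abs(y - y_hat))   # 3. receive feedback (true target y)
        model.update(x, y)              # 4. update model
    return errors                       # 5. the next iteration receives new data


errors = run_online([(0, 1.0), (0, 1.0), (0, 3.0)], LastValueModel())
```

Each instance is used for evaluation first and for training second, so the error sequence reflects performance on data the model has not yet seen.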

Page 21: Adaptive pre-processing for streaming data

Online prediction system

[Diagram: PRE-PROCESSING and PREDICTOR in place of the single MODEL, marked with "??"]

1. receive current data
2. output prediction
3. receive feedback
4. update model
5. receive new data

Page 22: Adaptive pre-processing for streaming data

Decoupling adaptivity – why?

● Forced
  ● Different modes of adaptivity: predictor updates incrementally, pre-processing needs retraining (batches)
● May be beneficial
  ● one of the elements may be still good enough
    – changes in data do not change the relation between concepts (classes) in data, e.g. change in noise
  ● different amounts of training data required

Page 23: Adaptive pre-processing for streaming data

Different amounts of data for accurate training

● synthetic Gaussian data, binary classification problem
● assume known change point

[Figure panels: STATIC SITUATION, DATA STREAM]

Page 24: Adaptive pre-processing for streaming data

Challenges

● Consistency of feature representation over time
● Consistency of feedback over time

[Diagram: raw data → Pre-processing element → Prediction element → prediction; labelled links between the two elements: 1. transformed data, 2. feedback]

Page 25: Adaptive pre-processing for streaming data

1. Example: feature representation

If we modify the pre-processing element, the input to the predictive element changes
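A small numeric sketch of this effect, assuming standardization as the pre-processing step and a linear predictor with fixed weights (both chosen only for illustration): the same raw reading is mapped to a different input once the scaler is retrained, so the unchanged predictor produces a very different output.

```python
def fit_scaler(values):
    """Estimate mean and standard deviation for standardization."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5 or 1.0  # fall back to 1.0 for constant data
    return mean, std


def transform(x, scaler):
    mean, std = scaler
    return (x - mean) / std


def predict(z, weight=2.0, bias=1.0):
    """Predictor trained earlier on standardized inputs; its weights stay fixed."""
    return weight * z + bias


old_scaler = fit_scaler([0.0, 2.0])    # fitted before the process changed
new_scaler = fit_scaler([10.0, 12.0])  # retrained on recent data

raw_reading = 11.0
before = predict(transform(raw_reading, old_scaler))  # predictor input is 10.0
after = predict(transform(raw_reading, new_scaler))   # predictor input is 0.0
```

Whether the jump helps or hurts depends on whether the predictor is adapted to the new representation at the same time.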

Page 26: Adaptive pre-processing for streaming data

Challenges

[Diagram: raw data → Pre-processing element → Prediction element → prediction; labelled links between the two elements: 1. transformed data, 2. feedback]

Adaptive mode of pre-processing el.   incremental   incremental     retrain       retrain
Adaptive mode of predictive el.       incremental   retrain         incremental   retrain
1. transformation                     evolving      evolving        shock         shock
2. feedback                           evolving      shock           evolving      shock
                                      no problem    small problem   problem       problem if not synchronized


Page 28: Adaptive pre-processing for streaming data

Research questions for adaptive pre-processing

● How to decide
  ● when to adapt pre-processing and when to adapt predictor
● How to integrate
  ● adaptivity of two elements when pre-processing completely transforms the input space (PCA)
● How to handle
  ● the `shock' of new pre-processing output in the incremental learning mode
● How to monitor and detect
  ● the need for adapting the pre-processing element in very high dimensional spaces

Page 29: Adaptive pre-processing for streaming data

Some experimental evidence

Page 30: Adaptive pre-processing for streaming data

Case study
● 2.5 years of data, readings every 5 min
● 86 sensors (features), ~170 thousand instances
● classification problem

Page 31: Adaptive pre-processing for streaming data

Strategies
● Strategies with fixed training windows
  ● old-old, old-new, new-old, new-new
● Online strategy selection: adaptive pre-processing
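One way online strategy selection could work is sketched below. The slides do not spell out the selection rule, so this is an assumption: each strategy name is read as "training window of pre-processing"-"training window of predictor", and the strategy with the lowest recent error is chosen. The error values are made-up numbers, not results from the case study.

```python
from collections import Counter
from itertools import product

# Assumed reading: first word = window used for pre-processing, second = for predictor.
STRATEGIES = ["-".join(pair) for pair in product(("old", "new"), repeat=2)]


def select_strategy(recent_errors):
    """Pick the strategy with the lowest mean error on recently labelled data."""
    return min(STRATEGIES, key=lambda s: sum(recent_errors[s]) / len(recent_errors[s]))


# Hypothetical error logs at two points in time (illustrative values only).
step_1 = {"old-old": [0.1, 0.2], "old-new": [0.3, 0.3],
          "new-old": [0.4, 0.2], "new-new": [0.5, 0.5]}
step_2 = {"old-old": [0.6, 0.4], "old-new": [0.3, 0.3],
          "new-old": [0.1, 0.1], "new-new": [0.2, 0.4]}

choices = [select_strategy(step) for step in (step_1, step_2)]
usage = Counter(choices)  # how often each strategy was selected
```

Counting the selections over a whole stream gives the kind of usage shares that can show whether the mixed strategies (old-new, new-old) are ever preferred.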

Page 32: Adaptive pre-processing for streaming data

Results
● Naive Bayes, SVM, CART tree
● e.g. NB: online strategy selected
  ● old-old 58% of times, old-new 15%, new-old 17% and new-new 10%
  ● it means decoupling is useful

Page 33: Adaptive pre-processing for streaming data

Conclusion

Page 34: Adaptive pre-processing for streaming data

Conclusion
● If we want to automate online learning, we need to automate pre-processing as well
● Decoupling of adaptivities may be necessary if different modes of adaptivity are used
● Decoupling may be beneficial to accuracy due to different amounts of training data required
● Experiments with synthetic and real data show that there is room for adaptive (decoupled) pre-processing

Page 35: Adaptive pre-processing for streaming data

Thanks

2 PhD studentships are available
contact: [email protected]

Page 36: Adaptive pre-processing for streaming data

Acknowledgements
Part of the research leading to these results has received funding from the EC within the Marie Curie Industry and Academia Partnerships and Pathways (IAPP) programme under grant agreement no. 251617.