Introduction to Predictive Learning
Electrical and Computer Engineering
LECTURE SET 2: Basic Learning Approaches and Complexity Control

Transcript of "Introduction to Predictive Learning", LECTURE SET 2 (Basic Learning Approaches and Complexity Control)

Page 1: Introduction to Predictive Learning (Electrical and Computer Engineering), LECTURE SET 2: Basic Learning Approaches and Complexity Control.

Introduction to Predictive Learning

Electrical and Computer Engineering

LECTURE SET 2

Basic Learning Approaches and Complexity Control

Page 2

OUTLINE

2.0 Objectives

2.1 Terminology and Basic Learning Problems

2.2 Basic Learning Approaches

2.3 Generalization and Complexity Control

2.4 Application Example

2.5 Summary

Page 3

2.0 Objectives

1. To quantify the notions of explanation, prediction, and model

2. To introduce terminology

3. To describe basic learning methods

• Past observations ~ data points

• Explanation (model) ~ function

• Learning ~ function estimation

• Prediction ~ using the estimated model to make predictions

Page 4

2.0 Objectives (cont’d)

• Example: classification (training samples, model)

• Goal 1: explanation of training data

• Goal 2: generalization (to future data)

• Learning is ill-posed

Page 5

Learning as Induction

Induction ~ function estimation from data

Deduction ~ prediction for new inputs, using the estimated function

Page 6

2.1 Terminology and Learning Problems

• Input and output variables

[Figure: a System with observed input x, unobserved factors z, and output y; scatter plots of y vs x]

• Learning ~ estimation of F(x): x → y

• Statistical dependency vs causality

Page 7

2.1.1 Types of Input and Output Variables

• Real-valued

• Categorical (class labels)

• Ordinal (or fuzzy) variables

• Aside: fuzzy sets and fuzzy logic

[Figure: membership value vs weight (lbs), over the range 75–225, showing overlapping fuzzy sets LIGHT, MEDIUM, and HEAVY]

Page 8

Data Preprocessing and Scaling

• Preprocessing is required with observational data (step 4 in the general experimental procedure)

Examples: …

• Basic preprocessing includes

- summary univariate statistics: mean, standard deviation, min + max value, range, boxplot, performed independently for each input/output

- detection (removal) of outliers

- scaling of input/output variables (may be required for some learning algorithms)

• Visual inspection of data is tedious but useful
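The basic preprocessing steps above can be sketched in Python (a minimal illustration with made-up numbers; numpy is assumed available):

```python
import numpy as np

def summarize(x):
    """Summary univariate statistics for a single variable."""
    x = np.asarray(x, dtype=float)
    return {"mean": float(x.mean()), "std": float(x.std()),
            "min": float(x.min()), "max": float(x.max()),
            "range": float(x.max() - x.min())}

def scale_01(x):
    """Min-max scaling of one variable to the [0, 1] range."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Made-up weights in lbs; each variable is summarized and scaled independently.
weight = [75.0, 100.0, 125.0, 150.0, 200.0]
stats = summarize(weight)
scaled = scale_01(weight)
```

Outlier detection would typically be applied before scaling, since the min and max in min-max scaling are sensitive to extreme values.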

Page 9

Example Data Set: animal body & brain weight (body in kg, brain in grams)

1 Mountain beaver 1.350 8.100

2 Cow 465.000 423.000

3 Gray wolf 36.330 119.500

4 Goat 27.660 115.000

5 Guinea pig 1.040 5.500

6 Diplodocus 11700.000 50.000

7 Asian elephant 2547.000 4603.000

8 Donkey 187.100 419.000

9 Horse 521.000 655.000

10 Potar monkey 10.000 115.000

11 Cat 3.300 25.600

12 Giraffe 529.000 680.000

13 Gorilla 207.000 406.000

14 Human 62.000 1320.000

Page 10

Example Data Set (cont’d): body in kg, brain in grams

15 African elephant 6654.000 5712.000

16 Triceratops 9400.000 70.000

17 Rhesus monkey 6.800 179.000

18 Kangaroo 35.000 56.000

19 Hamster 0.120 1.000

20 Mouse 0.023 0.400

21 Rabbit 2.500 12.100

22 Sheep 55.500 175.000

23 Jaguar 100.000 157.000

24 Chimpanzee 52.160 440.000

25 Brachiosaurus 87000.000 154.500

26 Rat 0.280 1.900

27 Mole 0.122 3.000

28 Pig 192.000 180.000

Page 11

Original Unscaled Animal Data: what points are outliers?

Page 12

Animal Data: with outliers removed and scaled to [0, 1] range; humans appear in the top left corner

[Figure: scatter plot of brain weight vs body weight, both scaled to [0, 1]]

Page 13

2.1.2 Supervised Learning: Regression

• Data in the form (x, y), where

- x is multivariate input (i.e., a vector)

- y is univariate output (‘response’)

• Regression: y is real-valued

Estimation of a real-valued function x → y

[Figure: noisy training samples and an estimated regression curve over x in [0, 1]]

Page 14

2.1.2 Supervised Learning: Classification

• Data in the form (x, y), where

- x is multivariate input (i.e., a vector)

- y is univariate output (‘response’)

• Classification: y is categorical (class label)

Estimation of an indicator function x → y

Page 15

2.1.2 Unsupervised Learning

• Data in the form (x), where

- x is multivariate input (i.e. vector)

• Goal 1: data reduction or clustering

Clustering = estimation of a mapping x → c, where c is a cluster index

Page 16

Unsupervised Learning (cont’d)

• Goal 2: dimensionality reduction

Finding low-dimensional model of the data

Page 17

2.1.3 Other (nonstandard) learning problems

• Multiple model estimation:

Page 18

OUTLINE

2.0 Objectives

2.1 Terminology and Learning Problems

2.2 Basic Learning Approaches

- Parametric Modeling

- Non-parametric Modeling

- Data Reduction

2.3 Generalization and Complexity Control

2.4 Application Example

2.5 Summary

Page 19

2.2.1 Parametric Modeling

Given training data (x_i, y_i), i = 1, 2, …, n

(1) Specify a parametric model

(2) Estimate its parameters (via fitting to the data)

• Example: Linear regression f(x) = (w · x) + b

Σ_{i=1}^{n} (y_i − (w · x_i) − b)² → min
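A minimal sketch of the two steps for linear regression, using ordinary least squares via numpy (the data here are made up for illustration):

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares estimate of f(x) = (w . x) + b,
    minimizing sum_i (y_i - (w . x_i) - b)^2."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # extra column for the bias b
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef[:-1], coef[-1]                      # (w, b)

# Noise-free samples of y = 2x + 1; least squares recovers w = 2, b = 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0
w, b = fit_linear(X, y)
```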

Page 20

Parametric Modeling (cont’d)

Given training data (x_i, y_i), i = 1, 2, …, n

(1) Specify a parametric model

(2) Estimate its parameters (via fitting to the data)

Univariate classification

Page 21

2.2.2 Non-Parametric Modeling

Given training data (x_i, y_i), i = 1, 2, …, n, estimate the model (for a given x₀) as a ‘local average’ of the data.

Note: need to define ‘local’ and ‘average’

• Example: k-nearest-neighbors regression

f(x₀) = (1/k) Σ_{j=1}^{k} y_j, where the sum is over the k nearest neighbors of x₀
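A minimal sketch of k-nearest-neighbors regression for univariate x (made-up data; numpy assumed):

```python
import numpy as np

def knn_regress(x0, X, y, k):
    """k-nearest-neighbors estimate at x0:
    the 'local average' of y over the k points closest to x0."""
    d = np.abs(X - x0)               # distances, univariate input
    nearest = np.argsort(d)[:k]      # indices of the k nearest neighbors
    return float(np.mean(y[nearest]))

X = np.array([0.0, 0.1, 0.2, 0.8, 0.9])
y = np.array([0.0, 1.0, 2.0, 8.0, 9.0])
est = knn_regress(0.05, X, y, k=2)   # nearest neighbors: x = 0.0 and x = 0.1
```

Here ‘local’ is defined by the distance |x − x₀| and ‘average’ is the arithmetic mean; k controls the size of the local region.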

Page 22

2.2.3 Data Reduction Approach

Given training data, estimate the model as ‘compact encoding’ of the data.

Note: ‘compact’ ~ # of bits to encode the model

• Example: piece-wise linear regression

How many parameters are needed for a two-linear-component model?

Page 23

Example: piece-wise linear regression vs linear regression

[Figure: data with a piece-wise linear fit vs a single linear fit, y vs x over [0, 1]]

Page 24

Data Reduction Approach (cont’d)

Data Reduction approaches are commonly used for unsupervised learning tasks.

• Example: clustering.

Training data encoded by 3 points (cluster centers)

Issues:

- How to find centers?

- How to select the number of clusters?
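The slide leaves ‘how to find centers’ open; one standard answer is Lloyd’s k-means algorithm, sketched below (a simple deterministic initialization is used for illustration; practical implementations initialize randomly and restart several times):

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Lloyd's algorithm: alternately assign each point to its nearest
    center, then move each center to the mean of its assigned points."""
    centers = X[:k].copy()                    # deterministic init for illustration
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)             # assignment step
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)  # update step
    return centers, labels

# Two well-separated blobs; centers converge to the blob means.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centers, labels = kmeans(X, k=2)
```

Selecting the number of clusters k is itself a complexity-control problem, taken up in Section 2.3.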

Page 25

Inductive Learning Setting

Induction and Deduction in Philosophy:
All observed swans are white (data samples). Therefore, all swans are white.

• Model estimation ~ inductive step, i.e., estimating a function from data samples

• Prediction ~ deductive step

• Discussion: which of the 3 modeling approaches follow inductive learning?

• Do humans implement inductive inference?

Page 26

OUTLINE

2.0 Objectives

2.1 Terminology and Learning Problems

2.2 Modeling Approaches & Learning Methods

2.3 Generalization and Complexity Control

- Prediction Accuracy (generalization)

- Complexity Control: examples

- Resampling

2.4 Application Example

2.5 Summary

Page 27

2.3.1 Prediction Accuracy

Inductive Learning ~ function estimation

• All modeling approaches implement ‘data fitting’ ~ explaining the data

• BUT the true goal ~ prediction

• Two possible goals of learning:

- estimation of the ‘true function’

- good generalization for future data

• Are these two goals equivalent?

• If not, which one is more practical?

Page 28

Explanation vs Prediction

(a) Classification (b) Regression

Page 29

Inductive Learning Setting

• The learning machine observes samples (x, y) and returns an estimated response ŷ = f(x, w)

• Recall ‘first-principles’ vs ‘empirical’ knowledge

Two modes of inference: identification vs imitation

• Risk functional:

R(w) = ∫ Loss(y, f(x, w)) dP(x, y) → min

Page 30

Discussion

• The math formulation is useful for quantifying

- explanation ~ fitting error (on training data)

- generalization ~ prediction error

• Natural assumptions:

- the future is similar to the past: stationary P(x, y), i.i.d. data

- a discrepancy measure or loss function, e.g., MSE

• What if these assumptions do not hold?

Page 31

Example: Regression

Given: training data (x_i, y_i), i = 1, 2, …, n

Find a function f(x, w) that minimizes squared error for a large number (N) of future samples:

Σ_{k=1}^{N} [y_k − f(x_k, w)]² → min, i.e., ∫ (y − f(x, w))² dP(x, y) → min

BUT future data is unknown ~ P(x, y) unknown

[Figure: noisy training samples and a fitted curve over x in [0, 1]]

Page 32

2.3.2 Complexity Control: parametric modeling

Consider regression estimation

• Ten training samples from y = x² + noise, noise ~ N(0, σ²), where σ = 0.25

• Fitting linear and second-order polynomial models

Page 33

Complexity Control: local estimation

Consider regression estimation

• Ten training samples from y = x² + noise, noise ~ N(0, σ²), where σ = 0.25

• Using k-nn regression with k = 1 and k = 4

Page 34

Complexity Control (cont’d)

• Complexity (of admissible models) affects generalization (for future data)

• Specific complexity indices:

– Parametric models: ~ # of parameters

– Local modeling: size of the local region

– Data reduction: # of clusters

• Complexity control = choosing good complexity (~ good generalization) for given (training) data

Page 35

How to Control Complexity?

• Two approaches: analytic and resampling

• Analytic criteria estimate prediction error as a function of fitting error and model complexity. For regression problems:

R_est = r(p, n) · R_emp, where p = DoF/n, n ~ sample size, DoF ~ degrees of freedom

Representative analytic criteria for regression:

• Schwartz Criterion: r(p, n) = 1 + ((ln n)/2) · p/(1 − p)

• Akaike’s FPE: r(p) = (1 + p)/(1 − p)
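A small numeric illustration of how an analytic criterion penalizes complexity, using Akaike’s FPE factor r(p) = (1 + p)/(1 − p) with p = DoF/n (the values below are made up):

```python
def fpe_factor(dof, n):
    """Akaike's Final Prediction Error factor r(p) = (1 + p) / (1 - p), p = DoF/n."""
    p = dof / n
    return (1.0 + p) / (1.0 - p)

def estimated_risk(emp_risk, dof, n):
    """Estimated prediction risk = penalty factor * empirical (fitting) risk."""
    return fpe_factor(dof, n) * emp_risk

# Same fitting error, different complexity: the penalty grows with DoF.
r_simple = estimated_risk(0.10, dof=2, n=20)    # p = 0.1
r_complex = estimated_risk(0.10, dof=10, n=20)  # p = 0.5
```

Model selection then picks the complexity (DoF) minimizing the estimated risk, not the fitting error alone.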

Page 36

2.3.3 Resampling

• Split the available data into 2 sets: Training + Validation

(1) Use the training set for model estimation (via data fitting)

(2) Use the validation data to estimate the prediction error of the model

• Change model complexity index and repeat (1) and (2)

• Select the final model providing lowest (estimated) prediction error

BUT results are sensitive to data splitting

Page 37

K-fold cross-validation

1. Divide the training data Z into k randomly selected disjoint subsets {Z1, Z2, …, Zk} of size n/k

2. For each ‘left-out’ validation set Zi:

- use the remaining data to estimate the model ŷ = f_i(x)

- estimate the prediction error on Zi:

r_i = (k/n) Σ_{(x, y) ∈ Zi} (y − f_i(x))²

3. Estimate the average prediction risk as

R_cv = (1/k) Σ_{i=1}^{k} r_i
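The k-fold procedure above can be sketched as follows (the random shuffling of step 1 is omitted for determinism, and the constant model at the end is just a stand-in learner):

```python
import numpy as np

def kfold_cv_risk(X, y, k, fit, predict):
    """Estimate average prediction risk by k-fold cross-validation:
    hold out each fold Z_i, fit on the rest, average squared error over folds."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)      # disjoint validation subsets
    risks = []
    for val_idx in folds:
        tr_idx = np.setdiff1d(np.arange(n), val_idx)
        model = fit(X[tr_idx], y[tr_idx])        # estimate model on remaining data
        err = np.mean((y[val_idx] - predict(model, X[val_idx])) ** 2)
        risks.append(err)                        # prediction error on Z_i
    return float(np.mean(risks))                 # R_cv: average over the k folds

# Stand-in learner: a constant model that predicts the training mean.
fit_mean = lambda X, y: float(np.mean(y))
predict_mean = lambda m, X: np.full(len(X), m)

X = np.arange(6, dtype=float)
y = np.ones(6)                                   # constant target -> zero CV risk
risk = kfold_cv_risk(X, y, 3, fit_mean, predict_mean)
```

Setting k = n gives leave-one-out cross-validation, discussed later in this section.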

Page 38

Example of model selection (1)

• 25 samples are generated from the target function below, with x uniformly sampled in [0, 1] and noise ~ N(0, 1)

• Regression is estimated using polynomials of degree m = 1, 2, …, 10

• Polynomial degree m = 5 is chosen via 5-fold cross-validation. The curve shows the polynomial model, along with training (*) and validation (*) data points, for one partitioning.

m  Estimated R via cross-validation

1 0.1340

2 0.1356

3 0.1452

4 0.1286

5 0.0699

6 0.1130

7 0.1892

8 0.3528

9 0.3596

10 0.4006

Target function: y = sin²(2πx)
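A hedged sketch of the same kind of experiment: choosing polynomial degree by 5-fold cross-validation on synthetic data (the noise level 0.1 and the random seed are illustrative choices, so the selected degree need not match the slide’s m = 5):

```python
import numpy as np

def cv_risk_poly(x, y, degree, k=5):
    """5-fold cross-validation estimate of prediction risk
    for a polynomial model of the given degree."""
    idx = np.arange(len(x))
    folds = np.array_split(idx, k)
    risks = []
    for val in folds:
        tr = np.setdiff1d(idx, val)
        coef = np.polyfit(x[tr], y[tr], degree)           # fit on training folds
        pred = np.polyval(coef, x[val])                   # predict on held-out fold
        risks.append(np.mean((y[val] - pred) ** 2))
    return float(np.mean(risks))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 25)
y = np.sin(2 * np.pi * x) ** 2 + 0.1 * rng.normal(size=25)

risks = {m: cv_risk_poly(x, y, m) for m in range(1, 8)}   # degrees m = 1..7
best_m = min(risks, key=risks.get)                        # lowest estimated risk
```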

Page 39

Example of model selection (2)

• Same data set, but estimated using k-nn regression

• The optimal value k = 7 is chosen via 5-fold cross-validation model selection. The curve shows the k-nn model, along with training (*) and validation (*) data points, for one partitioning.

k  Estimated R via cross-validation

1 0.1109

2 0.0926

3 0.0950

4 0.1035

5 0.1049

6 0.0874

7 0.0831

8 0.0954

9 0.1120

10 0.1227

Page 40

More on Resampling

• Leave-one-out (LOO) cross-validation

- extreme case of k-fold when k = n (# of samples)

- efficient use of data, but requires n model estimates

• The final (selected) model depends on:

- the random data

- the random partitioning of the data into k subsets (folds)

so the same resampling procedure may yield different model selection results

• Some applications may use non-random splitting of the data into (training + validation)

• Model selection via resampling is based on estimated prediction risk (error)

• Does this estimated error reflect the true prediction accuracy of the final model?

Page 41

Resampling for estimating true risk

• The prediction risk (test error) of a method can also be estimated via resampling

• Partition the data into: training / validation / test

• Test data should never be used for model estimation

• Double resampling method:

- for complexity control

- for estimating the prediction performance of a method

• Estimation of prediction risk (test error) is critical for comparing different learning methods

Page 42

Example of model selection for a k-NN classifier via 6-fold cross-validation: Ripley’s data.
Optimal decision boundary for k = 14

[Figure: Ripley’s two-class data with the k = 14 decision boundary]

Page 43

Example of model selection for a k-NN classifier via 6-fold cross-validation: Ripley’s data.
Optimal decision boundary for k = 50

Which one is better, k = 14 or k = 50?

[Figure: Ripley’s two-class data with the k = 50 decision boundary]

Page 44

Estimating the test error of a method

• For the same example (Ripley’s data), what is the true test error of the k-NN method?

• Use double resampling, i.e., 5-fold cross-validation to estimate the test error and 6-fold cross-validation to estimate the optimal k for each training fold:

Fold #  k   Validation error  Test error
1       20  11.76%            14%
2       9   0%                8%
3       1   17.65%            10%
4       12  5.88%             18%
5       7   17.65%            14%
mean        10.59%            12.8%

• Note: the optimal k-values differ, and the errors vary across folds, due to the high variability of the random partitioning of the data

Page 45

Estimating the test error of a method (cont’d)

• Another realization of double resampling, i.e., 5-fold cross-validation to estimate the test error and 6-fold cross-validation to estimate the optimal k for each training fold:

Fold #  k   Validation error  Test error
1       7   14.71%            14%
2       31  8.82%             14%
3       25  11.76%            10%
4       1   14.71%            18%
5       62  11.76%            4%
mean        12.35%            12%

• Note: the predicted average test error (12%) is usually higher than the minimized validation error (11%) used for model selection

Page 46

2.4 Application Example

• Why financial applications?

- “the market is always right” ~ loss function

- lots of historical data

- modeling results are easy to understand

• Background on mutual funds

• Problem specification + experimental setup

• Modeling results

• Discussion

Page 47

OUTLINE

2.0 Objectives

2.1 Terminology and Basic Learning Problems

2.2 Basic Learning Approaches

2.3 Generalization and Complexity Control

2.4 Application Example

2.5 Summary

Page 48

2.4.1 Background: pricing mutual funds

• Mutual funds trivia and recent scandals

• Mutual fund pricing:

- priced once a day (after market close), so the NAV is unknown when an order is placed

• How to estimate the NAV accurately?

Approach 1: estimate the holdings of a fund (~200–400 stocks), then compute the NAV

Approach 2: estimate the NAV via correlations between the NAV and major market indices (learning)

Page 49

2.4.2 Problem specs and experimental setup

• Domestic fund: Fidelity OTC (FOCPX)

• Possible inputs: SP500, DJIA, NASDAQ, ENERGY SPDR

• Data encoding:

Output ~ % daily price change in NAV

Inputs ~ % daily price changes of market indices

• Modeling period: 2003

• Issues: modeling method? selection of input variables? experimental setup?

Page 50

Experimental Design and Modeling Setup

Possible variable selection:

Mutual Fund (Y)   X1      X2      X3
FOCPX             ^IXIC   --      --
FOCPX             ^GSPC   ^IXIC   --
FOCPX             ^GSPC   ^IXIC   XLE

• All variables represent % daily price changes.

• Modeling method: linear regression

• Data obtained from Yahoo Finance.

• Time period for modeling: 2003

Page 51

Specification of Training and Test Data

Year 2003, divided into two-month periods: 1–2, 3–4, 5–6, 7–8, 9–10, 11–12

Training → Test (each two-month training period is followed by a two-month test period, sliding through the year)

Two-month Training/Test set-up; a total of 6 regression models for 2003

Page 52

Results for Fidelity OTC Fund (GSPC+IXIC)

Coefficients w0 w1 (^GSPC) w2 (^IXIC)

Average -0.027 0.173 0.771

Standard Deviation (SD) 0.043 0.150 0.165

Average model: Y = -0.027 + 0.173·^GSPC + 0.771·^IXIC

^IXIC is the main factor affecting FOCPX’s daily price change

Prediction error: MSE (GSPC+IXIC) = 5.95%

Page 53

Results for Fidelity OTC Fund (GSPC+IXIC)

Daily closing prices for 2003: NAV vs synthetic model

[Figure: daily account value (80–140) vs date in 2003, for FOCPX and Model (GSPC+IXIC)]

Page 54

Average model: Y = -0.029 + 0.147·^GSPC + 0.784·^IXIC + 0.029·XLE

^IXIC is the main factor affecting FOCPX’s daily price change

Prediction error: MSE (GSPC+IXIC+XLE) = 6.14%

Coefficients w0 w1 (^GSPC) w2 (^IXIC) w3 (XLE)

Average -0.029 0.147 0.784 0.029

Standard Deviation (SD) 0.044 0.215 0.191 0.061

Results for Fidelity OTC Fund (GSPC+IXIC+XLE)

Page 55

Results for Fidelity OTC Fund (GSPC+IXIC+XLE)

Daily closing prices for 2003: NAV vs synthetic model

[Figure: daily account value (80–140) vs date in 2003, for FOCPX and Model (GSPC+IXIC+XLE)]

Page 56

Effect of Variable Selection

Different linear regression models for FOCPX:

• Y = -0.035 + 0.897·^IXIC

• Y = -0.027 + 0.173·^GSPC + 0.771·^IXIC

• Y = -0.029 + 0.147·^GSPC + 0.784·^IXIC + 0.029·XLE

• Y = -0.026 + 0.226·^GSPC + 0.764·^IXIC + 0.032·XLE - 0.06·^DJI

These have different prediction errors (MSE):

• MSE (IXIC) = 6.44%

• MSE (GSPC + IXIC) = 5.95%

• MSE (GSPC + IXIC + XLE) = 6.14%

• MSE (GSPC + IXIC + XLE + DJIA) = 6.43%

(1) Variable selection is a form of complexity control

(2) Good selection can be performed by domain experts

Page 57

Discussion

• Many funds simply mimic major indices, so statistical NAV models can be used for ranking/evaluating mutual funds

• Statistical models can be used for

- hedging risk, and

- overcoming restrictions on trading (market timing) of domestic funds

• Since 70% of funds under-perform their benchmark indices, index funds may be the better choice

Page 58

Summary

• Inductive Learning ~ function estimation

• Goal of learning (empirical inference): to act/perform well, not system identification

• Important concepts:

- training data, test data

- loss function, prediction error (aka risk)

- basic learning problems

- basic learning methods

• Complexity control and resampling

• Estimating prediction error via resampling