IRENE_DLL: a library to evaluate model performance

Presented by:

Gianni Fila

Research Institute for Industrial Crops

Agronomy Section

g.fila@isci.it


Topics

1. Background

2. Overview

3. Using the library: basic concepts

4. Special procedures

5. Application examples

6. Future developments


1. Background


Why IRENE_DLL?

• Need for reliable model estimates
• No standard theory of model evaluation
• No standard “boxes of tools”
• A plethora of philosophical theories, statistical techniques, and software practices

IRENE_DLL: a set of tools all housed in a single, integrated component


What IRENE_DLL can do

• Difference-based analysis: indices, test statistics
• Regression-based analysis: parameters, test statistics
• Pattern detection: “Pattern Index”
• Probability distributions: density functions, cumulative distributions (exceedance, non-exceedance)
• Aggregation: first level (“module”), second level (“indicator”)
• “Shift” analysis (time mismatch)
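As an illustration of what a difference-based index looks like, the sketch below computes two widely used ones in Python: the root mean square error and the modelling efficiency. It follows the textbook definitions. The slides do not show the formulas IRENE_DLL itself implements, and the function names and sample values here are hypothetical, not part of the library.

```python
import math


def rmse(estimated, measured):
    """Root mean square error between paired estimates and measurements."""
    n = len(measured)
    return math.sqrt(sum((e - m) ** 2 for e, m in zip(estimated, measured)) / n)


def modelling_efficiency(estimated, measured):
    """1.0 is a perfect fit; 0.0 is no better than the mean of the measurements."""
    mean_m = sum(measured) / len(measured)
    sq_err = sum((e - m) ** 2 for e, m in zip(estimated, measured))
    sq_dev = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - sq_err / sq_dev


# Hypothetical paired values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
estimated = [1.2, 1.9, 3.3, 3.8]
print(rmse(estimated, measured), modelling_efficiency(estimated, measured))
```

Both functions assume the two series are already paired one to one; the modelling efficiency is undefined when all measurements are equal.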


2. Overview


IRENE_DLL in a nutshell

IRENE_DLL is a library of methods and functions to compute a variety of statistics and statistical tests

It consists of ten classes, containing data services, mathematical routines and some special data analysis procedures

All IRENE_DLL classes can be accessed individually (no hierarchy)


Overview

Data objects (DataSelections, DataSelection): they store data to be processed, and expose properties to handle them.

Computing objects (Pattern, DistributionFunctions, RegressionObject, IndexObject, Tests): each object holds a group of related functions.

Aggregation objects (Module, Indicator): they contain methods to perform statistics aggregation.

Accessory objects (GeneralRoutines): info-display routines.


3. Using the DLL: basic concepts


Handling data (1)

To be accepted by any function, data (estimated and measured) must be loaded into a DataSelection object.

[Diagram: external data (E = estimated, M = measured) loaded into a DataSelection with fields: estimated, measured, (independent)]


Handling data (2)

Whenever multiple series of data are to be processed, it is convenient to use a collective DataSelections object:

[Diagram: three external data sets (each with E and M) are loaded into DataSelection 1, DataSelection 2 and DataSelection 3, which are grouped in one DataSelections object]


Handling functions

To use a particular function, you must call it from the parent object, then pass it the data (in the DataSelection format) and the necessary specifications:

[Diagram: DataSelection and required specs go into a computing object (Function_1, Function_2, (….), Function_n), which returns the outputs]

Outputs from functions are aggregated in a single, collective variable


Function modes

• Paired_Columns
• Paired_Rows
• Unpaired_Average
• Unpaired_One_to_One

[Figure: grid of sample values (17 rows by 8 columns) illustrating how each mode reads the data]


Handling outputs

All functions in IRENE_DLL return a package of outputs

A special collective variable is designed for each type of function

Example: content of the Index_Variable:

Index_Variable
    Value()          Double
    StandardError    Double
    ConfInterval     Double
    mean             Double
    t_Student        Double
    UpperLimit       Double
    LowerLimit       Double
    Note             String


Example: computing regression

Compute regression parameters for three arrays of estimated data against three corresponding arrays of measured data

Array MyEstimated(1 to 365, 1 to 3)

Array MyMeasured(1 to 365, 1 to 3)


Example: (1) load data

Start an instance of a DataSelection object, and transfer your data inside it through the Estimated and Measured properties

Dim Nitrates As New DataSelection

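' Loop through the rows and columns, copying each value into the DataSelection
Dim i As Long, j As Long
For i = 1 To 365
    For j = 1 To 3
        Nitrates.Estimated(i, j) = MyEstimated(i, j)
        Nitrates.Measured(i, j) = MyMeasured(i, j)
    Next j
Next i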


Example: (2) Compute Regression

Start an instance of the RegressionObject, then call the function Regression_LS (least squares method)

Dim RegrCalculator As New RegressionObject

Dim RegrOutput As Regression_Variable

RegrOutput = RegrCalculator.Regression_LS(Nitrates, Paired_Columns, Measured_Variable)


Example: (3) Display results

Outputs are returned as arrays of values. Loop from j = 1 to 3 (the number of columns):

Print RegrOutput.Intercept(j)
Print RegrOutput.Intercept_StandError(j)
Print RegrOutput.Intercept_Tvs0(j)
Print RegrOutput.Intercept_Prob_Tvs0(j)
Print RegrOutput.Slope(j)
Print RegrOutput.Slope_StandError(j)
Print RegrOutput.Slope_Tvs0(j)
Print RegrOutput.Slope_Tvs1(j)
Print RegrOutput.Slope_Prob_Tvs0(j)
Print RegrOutput.Slope_Prob_Tvs1(j)
Print RegrOutput.F(j)
Print RegrOutput.Prob_F(j)
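To make the meaning of these outputs concrete, here is a minimal Python sketch of an ordinary least-squares fit for one pair of columns. It uses the standard textbook formulas and is not IRENE_DLL's code. The function name and dictionary keys are hypothetical, chosen only to mirror the fields listed above, and the probability values (which need the t and F distributions) are left out.

```python
import math


def least_squares(x, y):
    """Fit y = intercept + slope * x and return the fit statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return {
        "intercept": intercept,
        "intercept_se": se_intercept,
        "intercept_t_vs_0": intercept / se_intercept,
        "slope": slope,
        "slope_se": se_slope,
        "slope_t_vs_0": slope / se_slope,
        "slope_t_vs_1": (slope - 1.0) / se_slope,
        "F": slope ** 2 * sxx / s2,
    }


# Hypothetical values; treating the measured series as x is an assumption.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [1.1, 1.9, 3.2, 3.9, 5.1]
fit = least_squares(measured, estimated)
```

The two slope tests answer different questions: the test against 0 asks whether there is any linear relationship, while the test against 1 asks whether estimates and measurements agree one to one. The sketch needs at least three points and a fit that is not exact, otherwise the standard errors are zero and the ratios are undefined.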


4. Special procedures


Aggregation of indices: the problem

The need to define synthetic measures of model performance is a major topic of interest in the field of model evaluation.

Giving a solid judgement is often complicated by the need to balance, for instance, departure of estimates from measurements, modelling efficiency, correlation measures, presence/absence of systematic behavior in the residuals, etc.

The user may want to combine all such aspects in only one aggregated index.


IRENE’s approach

Index/test aggregation in IRENE is based on an expert system that applies decision rules according to fuzzy logic

This technique is robust when uncertain data are used (e.g. subjective judgements) and allows an aggregation of dissimilar statistics in a consistent and reproducible way.


Hierarchy of aggregation levels

IRENE_DLL supports two levels of statistics aggregation:

First level: multiple indices/tests aggregated in one single index (Module)

Second level: Multiple modules aggregated in one single index (Indicator)


Module specifications

Aggregation of statistics requires the user to introduce specifications derived from his/her expert knowledge and objectives

For each index to aggregate, the user must specify:

• The DataSelection
• The Favourable and Unfavourable limit values
• The relative weight


How to build a Module

1. Start an instance of the Module object (MyModule)

2. Add indices to aggregate, with the required specifications:

Add_Index: r_Pearson, DataSelection, 0.95, 0.90, 1
Add_Index: RMSE, DataSelection, 0.1, 0.8, 1
Add_Index: Pattern, DataSelection, 0.2, 0.75, 0.5

3. Finally, compute the module value by calling the function Module_Value (a score between 0 and 1)

[Diagram: r_Pearson, RMSE and Pattern feed into MyModule, which returns Module_Value]
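The role of the favourable limit, the unfavourable limit and the weight can be illustrated with a deliberately simplified Python sketch. It is not IRENE's expert system: the fuzzy decision rules are not described in these slides, so the sketch replaces them with a linear membership function and a weighted mean. The function names, the index values and the choice that 0 means fully favourable are all assumptions of the sketch.

```python
def unfavourable_degree(value, favourable, unfavourable):
    """0.0 at or beyond the favourable limit, 1.0 at or beyond the unfavourable limit."""
    # Works whether the favourable limit is the higher or the lower of the two.
    fraction = (value - favourable) / (unfavourable - favourable)
    return min(1.0, max(0.0, fraction))


def module_value(entries):
    """Weighted mean of the degrees; entries are (value, favourable, unfavourable, weight)."""
    total_weight = sum(w for _, _, _, w in entries)
    return sum(w * unfavourable_degree(v, f, u) for v, f, u, w in entries) / total_weight


# Limits and weights as on the slide; the index values are hypothetical.
entries = [
    (0.925, 0.95, 0.90, 1.0),  # r_Pearson
    (0.45, 0.1, 0.8, 1.0),     # RMSE
    (0.20, 0.2, 0.75, 0.5),    # Pattern
]
print(module_value(entries))
```

The point of the two limits is that indices with different units and directions (a correlation that should be high, an error that should be low) are first mapped to a common 0 to 1 scale, and only then combined according to their weights.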


Time mismatch analysis

There are no specific indices to investigate the uncertainty about possible displacements (delay or acceleration) in time series.

A time mismatch may be detected by an iterative procedure that repeatedly “shifts” the estimated points until optimal model performance values are found.

[Figure: time series of estimated and measured soil NH4 (mg kg-1), y-axis from 0 to 20, x-axis from 15/07/1992 to 15/05/1996]


Finding a time mismatch

Starting from an observed model performance as initial condition, the procedure runs as follows:

1. the simulated points are moved backward in time to the maximum anticipation chosen to evaluate the time mismatch;

2. the desired evaluation indices are computed;

3. the simulated points of one time history are moved forward by one selected time step;

4. steps 2 and 3 are repeated until the maximum allowed time delay is reached.

The time mismatch is identified by the time step (forward or backward) at which the best values of the evaluation indices are reached.
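The procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not IRENE_DLL's implementation: RMSE is used as the only evaluation index, the series are equally spaced, points shifted outside the measured range are simply dropped, and the function names and sample data are hypothetical.

```python
import math


def rmse_at_shift(estimated, measured, shift):
    """RMSE after moving the estimated series `shift` steps forward (negative = backward)."""
    n = len(measured)
    # The estimate at position i is compared with the measurement at position i + shift.
    pairs = [(estimated[i], measured[i + shift])
             for i in range(len(estimated)) if 0 <= i + shift < n]
    return math.sqrt(sum((e - m) ** 2 for e, m in pairs) / len(pairs))


def scan_shifts(estimated, measured, max_shift):
    """Evaluate every shift from -max_shift to +max_shift; return (best_shift, scores)."""
    # max_shift must be smaller than the series length so that some pairs remain.
    scores = {s: rmse_at_shift(estimated, measured, s)
              for s in range(-max_shift, max_shift + 1)}
    best_shift = min(scores, key=scores.get)
    return best_shift, scores


# Hypothetical series: the estimates reproduce the measured peak two steps too early.
measured = [0, 1, 4, 9, 16, 9, 4, 1, 0, 0]
estimated = [4, 9, 16, 9, 4, 1, 0, 0, 0, 0]
best, scores = scan_shifts(estimated, measured, 3)
```

Because a larger shift leaves fewer overlapping points, indices computed at the extremes of the range rest on less data than those near zero; this is worth keeping in mind when comparing them.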


Handling time mismatch in IRENE_DLL

All functions exposed by IRENE_DLL have an optional ShiftN parameter:

Example, in the IndexObject object:

Public Function RMSE(Sel As DataSelection, Mode As Mode, [ByVal ShiftN As Long = 0], [ByVal Prob_level As Double = 0.05]) As Index_Variable

By setting a ShiftN value <> 0, the index is computed shifting the estimated points.

We can explore a range of ShiftN values by a loop


5. Application examples


IRENE_DLL, sample applications

• Time mismatch analysis
• Difference-based indices
• Pattern Index


6. Future developments


In the near future:

• Integration in the IRENE interface
• Integration in development frameworks (MODCOM)
• Introduction of robust statistics (median-based)
• Introduction of randomization procedures (“bootstrap”)
• (….)


How to get IRENE_DLL

View documentation


The people of IRENE_DLL

• Gianni Fila (g.fila@isci.it), Research Institute for Industrial Crops
• Gianni Bellocchi, Research Institute for Industrial Crops
• Marcello Donatelli, Research Institute for Industrial Crops
• Marco Acutis, Department of Crop Science, University of Milan


Thank you!