CSC 599: Computational Scientific Discovery


Lecture 8: Lagramge (sp) and Inductive Process Modeling

Outline

Review
  Processes
  BACON
  Other equation finders
  The act of equation finding

Lagramge (sp)

Inductive Process Modeling

Review: Processes

Deal with changes over time
  Range
    (start, finish), (start, duration)
  Rates of change
    dx/dt
  Previous history
    attribute[event[t]] = f(event[1], event[2], . . . event[t-1])

Attributes of processes (when time is included)
  Quantity of “always on” forces as function of time
    Gravity, electromagnetism
  Maximum limit of “homeostatic” forces
    Friction, normal force
  Misc. changes during process

Review: BACON

BACON was driven by the data and domain knowledge:
  Distinguishing between independent and dependent attributes (BACON 3)
  Discovery of intrinsic properties of conceptual values (BACON 4)
  Preference for symmetric equations given knowledge that suggests one could exist (BACON 5)

Equation Finders, 1990s

[From Ljupco Todorovski's homepage: http://www-ai.ijs.si/~ljupco/]

COPER [Kokar 1986]
  Uses information about the dimension units of the system variables to restrict the space of possible equation structures.

BACON [Langley 1987]
  Pioneer among equation discovery systems. It uses a set of data-driven heuristics for finding regularities (constancies and trends) in data and for formulating hypotheses based on them.

Equation Finders, 1990s (2)

Fahrenheit/EF [Langley and Zytkow 1989], [Zembowicz and Zytkow 1992]
  EF is the equation discovery subsystem of the Fahrenheit scientific discovery system. It discovers bivariate equations only; the user can specify the set of operators and functions to be used within equations.

ABACUS [Falkenhainer and Michalski 1990]
  Experiments with different search strategies through the space of equation structures. Also allows discovery of piecewise equations, using clustering to identify the limits between pieces.

IDS [Nordhausen and Langley 1990]

ARC [Moulet 1992]

Equation Finders, 1990s (3)

E* [Schaffer 1993]
  Discovers bivariate equations using a small set of predefined equation structures.

LAGRANGE [Dzeroski and Todorovski 1995]
  Handled differential equations.

GOLDHORN [Krizman et al. 1995]
  Extended LAGRANGE towards discovery from noisy data.

SDS [Washio and Motoda 1997]
  Used information about scale types of the dimension units of the system variables to restrict the space of possible equations.

LAGRAMGE [Todorovski and Dzeroski 1997]
  Allowed the user to specify the space of possible equations with a context-free grammar.

Issues with equation finders

1. How to incorporate time?
  Esp. derivatives

2. How to separate:
  Heuristics used to search equation space
  Grading function used to determine how well an equation fits
  Compare:
    True function = $x^2$
    Current function = $-x^2$
    Large “error” in numeric space
    Small “error” in symbolic (equation) space

3. How best to incorporate domain knowledge?
  BACON programs did so, but in an ad hoc way
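The contrast in issue 2 can be made concrete with a small sketch. The expression-tree encoding and the crude "symbolic distance" (count of node labels present in one tree but not the other) are illustrative assumptions, not measures used by any of the systems above.

```python
from collections import Counter

# Expressions are nested tuples: ("neg", ("pow", "x", 2)) stands for -(x^2).
TRUE_FN = ("pow", "x", 2)
CURRENT_FN = ("neg", ("pow", "x", 2))

def evaluate(expr, x):
    """Evaluate an expression tree at the point x."""
    if expr == "x":
        return x
    if not isinstance(expr, tuple):
        return expr  # numeric constant
    op, *args = expr
    vals = [evaluate(a, x) for a in args]
    if op == "pow":
        return vals[0] ** vals[1]
    if op == "neg":
        return -vals[0]
    raise ValueError(f"unknown operator: {op}")

def numeric_error(f, g, xs):
    """Sum of squared differences between two expressions over sample points."""
    return sum((evaluate(f, x) - evaluate(g, x)) ** 2 for x in xs)

def node_labels(expr):
    """Flatten a tree into the list of its operator and leaf labels."""
    if not isinstance(expr, tuple):
        return [expr]
    out = [expr[0]]
    for arg in expr[1:]:
        out.extend(node_labels(arg))
    return out

def symbolic_distance(f, g):
    """Hypothetical measure: labels in one tree that are missing from the other."""
    cf, cg = Counter(node_labels(f)), Counter(node_labels(g))
    return sum(((cf - cg) + (cg - cf)).values())

num_err = numeric_error(TRUE_FN, CURRENT_FN, [1, 2, 3])
sym_err = symbolic_distance(TRUE_FN, CURRENT_FN)
print(num_err, sym_err)
```

The two trees differ by a single "neg" node, yet the numeric error grows with every sample point; a search guided only by numeric error would not see how close the structure is.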

Lagrange:

The Person
  Italian/French mathematician and physicist
  1736-1813

The System
  Mid-1990s equation finder
  Found differential equations
  Dzeroski and Todorovski 1995

Lagramge (sp)

Lagramge = Lagrange (system) + grammar

Handling the issues:

1. How to account for time?
  Each vector of variable values can represent the state of the system at a particular time
  Throw both attr and d[attr]/dt into the variable set

2. Separating
  a) Heuristics for searching equation space
    Explicit user-defined function for judging function similarity
  b) Gauging equation fit
    Sum of squared errors
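A minimal sketch of both points, assuming derivatives are estimated by finite differences (the slides do not say how Lagramge obtains them) and using made-up sample data:

```python
def finite_difference(times, values):
    """Estimate d[attr]/dt: central differences inside, one-sided at the ends."""
    n = len(values)
    derivs = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        derivs.append((values[hi] - values[lo]) / (times[hi] - times[lo]))
    return derivs

def sum_squared_errors(predicted, observed):
    """Gauge equation fit as the sum of squared errors."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

# Hypothetical measurements of one attribute x over time.
times = [0.0, 1.0, 2.0, 3.0]
x = [0.0, 1.0, 4.0, 9.0]
dx = finite_difference(times, x)

# Each row is one system state; the derivative is just another variable.
rows = [{"t": t, "x": xi, "dx/dt": di} for t, xi, di in zip(times, x, dx)]
print(rows)
```

Once `dx/dt` sits in the table as an ordinary column, an equation finder can predict it like any other variable, which is how a differential equation comes out of a search over ordinary ones.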

Lagramge, top algorithm

void lagramge(Grammar g, int maxComplexity, int maxBeamWidth,
              fnc fittingFnc(), fnc eqnPreferFnc(), fnc stoppingCriteriaMet()) {
  T0 = init symbol
  Q1 = {T0}
  do {
    Q = Q1
    R = Q.generateRefinements();
    foreach r in R {
      r.calcBestFitConstants(fittingFnc);
      r.symbolicCloseness = eqnPreferFnc.calc(r);
    }
    Q1 = union(Q,R);
    Q1.keepBest(maxBeamWidth,symbolicCloseness);
  } while ( (Q != Q1) && !stoppingCriteriaMet() );
}
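The loop above can be sketched in runnable form on a toy problem. Everything specific here is an assumption made for illustration: structures are products of factors drawn from a two-entry table (`FACTORS`) rather than parse trees of a grammar, there is exactly one constant per structure, and candidates are ranked by sum of squared errors.

```python
# Toy structure space: y = c * (product of chosen factors). Hypothetical, not Lagramge's.
FACTORS = {"x": lambda x: x, "(x+1)": lambda x: x + 1.0}

def unit_prediction(structure, x):
    g = 1.0
    for name in structure:
        g *= FACTORS[name](x)
    return g

def fit_constant(structure, data):
    """Closed-form least squares for y = c * g(x); returns (c, sse)."""
    g = [unit_prediction(structure, x) for x, _ in data]
    c = sum(gi * y for gi, (_, y) in zip(g, data)) / sum(gi * gi for gi in g)
    sse = sum((y - c * gi) ** 2 for gi, (_, y) in zip(g, data))
    return c, sse

def refinements(structure, max_complexity):
    """Grow a structure by one factor, up to the complexity limit."""
    if len(structure) >= max_complexity:
        return []
    return [tuple(sorted(structure + (name,))) for name in FACTORS]

def beam_search(data, max_complexity=3, beam_width=2):
    beam = {(): fit_constant((), data)}           # Q1 = {T0}
    while True:
        candidates = dict(beam)                   # union(Q, R)
        for s in beam:
            for r in refinements(s, max_complexity):
                if r not in candidates:
                    candidates[r] = fit_constant(r, data)
        best = sorted(candidates.items(), key=lambda kv: kv[1][1])[:beam_width]
        new_beam = dict(best)                     # keepBest
        if new_beam.keys() == beam.keys():        # Q == Q1: nothing improved
            return best
        beam = new_beam

data = [(float(x), 3.0 * x * x) for x in range(1, 6)]  # synthetic: y = 3x^2
result = beam_search(data)
print(result[0])
```

The first entry of `result` is the best structure with its fitted constant and error. The stopping test here is only "the beam did not change"; a user-supplied `stoppingCriteriaMet()` would be an additional exit.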

Heuristic functions:

fnc fittingFnc()
  Job = given this equation parse tree, find best fitting parameters

    $attr_{learn} = c_1 \cdot attr_1 + c_2 \cdot attr_2 + c_3 \cdot attr_1 \cdot attr_2 + c_4$

  Distinguish between:
    space of equation parse trees (countable)
    space of all the floating point parameters (in principle, uncountable)

fnc eqnPreferFnc()
  Job = given this instantiated equation, how well does it fit the data?
  Traditional least squares, or some variant
  Noise tolerant
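For an equation structure that is linear in its constants, such as the example with $c_1 \ldots c_4$, the best-fitting parameters have a closed form. The sketch below solves the normal equations with plain Gaussian elimination; the data are synthetic, and real structures (for example those containing monod terms) would generally need nonlinear optimization instead.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_constants(rows):
    """rows: (attr1, attr2, attr_learn). Returns least-squares [c1, c2, c3, c4]."""
    feats = [[a1, a2, a1 * a2, 1.0] for a1, a2, _ in rows]
    ys = [y for _, _, y in rows]
    n = 4
    AtA = [[sum(f[i] * f[j] for f in feats) for j in range(n)] for i in range(n)]
    Aty = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(n)]
    return solve(AtA, Aty)

# Synthetic data from attr_learn = 2*attr1 - attr2 + 0.5*attr1*attr2 + 3.
grid = (0.0, 1.0, 2.0)
rows = [(a1, a2, 2 * a1 - a2 + 0.5 * a1 * a2 + 3) for a1 in grid for a2 in grid]
constants = fit_constants(rows)
print(constants)
```

The search over parse trees picks the structure; this step only fills in the floating-point constants for one fixed structure.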

Lagramge Handling Issues (2)

3. Incorporation of domain knowledge
  Use a grammar!

Inputs:
  1. Context-free grammar (more on next slide)
  2. Input data D = (V, v_d, M)
    V = set of variables
    v_d = variable to predict, in V
    M = measurements of all variables:

      time   v_1       v_2       . . .   v_n
      t_0    v_{1,0}   v_{2,0}   . . .   v_{n,0}
      t_1    v_{1,1}   v_{2,1}   . . .   v_{n,1}
      . . .

Outputs:
  Ordinary or differential equation

Grammar

Context-free grammar = <nonTerminalSet, terminalSet, productionSet, startSym>

Ex:
  double monod(double c, double v) { return(v/(v+c)); }

  N = {E, F, M, v}
  T = {+, const, *, monod, "(", ",", ")", N, P, Z}
  P = { E -> const | const*F | E + const*F
        F -> v | M | v*M
        M -> monod(const,v)
        v -> N | P | Z }
  S = E

  (Inside T, the symbols N, P and Z are the system variables, not the sets N and P.)
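To see what equation space this grammar defines, here is a sketch that encodes the productions as a Python dictionary and enumerates every expression whose parse tree stays within a height bound. The token-list encoding and the `expand` helper are assumptions made for the example.

```python
import itertools

def monod(c, v):
    """Monod term from the slide: v / (v + c)."""
    return v / (v + c)

# Each nonterminal maps to its productions; each production is a token list.
GRAMMAR = {
    "E": [["const"], ["const", "*", "F"], ["E", "+", "const", "*", "F"]],
    "F": [["v"], ["M"], ["v", "*", "M"]],
    "M": [["monod(", "const", ",", "v", ")"]],
    "v": [["N"], ["P"], ["Z"]],
}

def expand(symbol, depth):
    """All terminal strings derivable from symbol with parse-tree height <= depth."""
    if symbol not in GRAMMAR:
        return {symbol}          # terminals expand to themselves
    if depth == 0:
        return set()             # no room left to apply a production
    results = set()
    for production in GRAMMAR[symbol]:
        parts = [expand(s, depth - 1) for s in production]
        for combo in itertools.product(*parts):
            results.add("".join(combo))
    return results

shallow = expand("E", 3)
deeper = expand("E", 4)
print(sorted(shallow))
print(len(deeper))
```

Raising the bound from 3 to 4 is what first admits monod terms such as `const*monod(const,N)`, so the height limit acts directly as a complexity knob.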

Tree's “Height” as its complexity

The deeper the tree, the more complex the expression

y = f1(x), y = f1(f2(x)), y = f1(f2(f3(x)))

Symbol height = shallowest height of deepest production

  For a production $p = A \to A_1 A_2 \ldots A_l$:
    $h(p) = 1 + \max_i h(A_i)$

  For a symbol A:
    $h(A) = \min_{q \in prods(A)} h(q)$ if A is a nonterminal
    $h(A) = 0$ if A is a terminal

  height   productions
  1        E -> const; v -> N; v -> P; v -> Z
  2        M -> monod(const,v); F -> v
  3        F -> M; F -> v*M; E -> const*F; E -> E + const*F
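The two height definitions are mutually recursive, so one way to evaluate them is to iterate to a fixed point. This is a sketch under that assumption; the dictionary encoding of the grammar is again just for illustration.

```python
# Productions as lists of symbols; anything not a key of GRAMMAR is a terminal.
GRAMMAR = {
    "E": [["const"], ["const", "*", "F"], ["E", "+", "const", "*", "F"]],
    "F": [["v"], ["M"], ["v", "*", "M"]],
    "M": [["monod", "const", "v"]],
    "v": [["N"], ["P"], ["Z"]],
}

def production_height(production, h):
    """h(p) = 1 + max over the symbols on the right-hand side (terminals are 0)."""
    return 1 + max(h.get(sym, 0) for sym in production)

def symbol_heights(grammar):
    """h(A) = min over A's productions, found by iterating until nothing changes."""
    h = {nt: float("inf") for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            best = min(production_height(p, h) for p in productions)
            if best < h[nt]:
                h[nt] = best
                changed = True
    return h

heights = symbol_heights(GRAMMAR)
table = {}
for nt, productions in GRAMMAR.items():
    for p in productions:
        table.setdefault(production_height(p, heights), []).append((nt, p))
print(heights)
print(sorted(table))
```

Grouping the productions by `production_height` reproduces the three rows of the height table.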

Lagramge Refinement

void refinement(Tree T, int maxHeight) {
  choose A, a nonterminal node in T
  p_{A,i} = production applied at that node
  l = pathLength from root of T to A
  delete subtrees of A
  if ( (i+1 <= A.numProductionsFor()) && (l + height(p_{A,i+1}) <= maxHeight) ) {
    p_{A,i+1} = A -> A_1 A_2 . . . A_k
    replace p_{A,i} with p_{A,i+1}
    expand A with successors A_1, A_2, . . . A_k
    do {
      choose nonterminal leaf B in T
      p_{B,1} = B -> B_1 B_2 . . . B_m
      expand B with successors B_1, B_2, . . . B_m
    } while ( T.hasNonTerminals() );
  }
}

Refinement Example

Lagramge Findings (1):

Aquatic ecosystems:
  N = concentration of nutrient
  P = concentration of phytoplankton
  Z = concentration of zooplankton

  $\frac{dN}{dt} = -\frac{N P}{k_N + N}$

  $\frac{dP}{dt} = \frac{N P}{k_N + N} - r_P P - \frac{P Z}{k_P + P}$

  $\frac{dZ}{dt} = \frac{P Z}{k_P + P} - r_Z Z$
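To get a feel for what such a model predicts, here is a sketch that integrates the three equations with forward Euler, reading the grazing term as a monod term, P Z / (k_P + P). The parameter values, initial state, and step size are invented for the example and are not taken from the lecture.

```python
def monod(c, v):
    return v / (v + c)

def derivatives(N, P, Z, kN, kP, rP, rZ):
    """Right-hand sides of the nutrient / phytoplankton / zooplankton model."""
    uptake = P * monod(kN, N)     # N*P / (kN + N)
    grazing = Z * monod(kP, P)    # P*Z / (kP + P)
    dN = -uptake
    dP = uptake - rP * P - grazing
    dZ = grazing - rZ * Z
    return dN, dP, dZ

def simulate(state, params, dt, steps):
    """Forward Euler; returns the list of (N, P, Z) states including the start."""
    trajectory = [state]
    for _ in range(steps):
        rates = derivatives(*state, **params)
        state = tuple(s + dt * r for s, r in zip(state, rates))
        trajectory.append(state)
    return trajectory

# Hypothetical constants and initial concentrations.
PARAMS = {"kN": 1.0, "kP": 1.0, "rP": 0.1, "rZ": 0.1}
trajectory = simulate((2.0, 0.5, 0.2), PARAMS, dt=0.01, steps=100)
print(trajectory[-1])
```

Because the uptake and grazing terms appear with opposite signs in neighbouring equations, the total N + P + Z can only shrink through the two loss terms, which is a quick sanity check on the signs.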

Lagramge Findings (2):

Two poles on a cart (inverted pendulum):

Lagramge Discussion

Problems it solves
  Handling noisy data
    Sum of squared errors
  Handling of time and derivatives
    Derivatives are just another variable
    Nothing special to do on user's part
  Distinguishes between error in symbol space and error in numeric value space
    Heuristic function for the first, grading function for the second
  Incorporation of domain knowledge
    Symbol space heuristic function
    Grammar

Lagramge Discussion (2)

Hold up one second! Do you believe this?
  Incorporation of domain knowledge
    Grammar

Grammar?!?
  Computer scientists think in terms of grammars
    FSM / regular expression
    PDA / context free
    Linear bounded automata / context sensitive
    Turing machines / unrestricted
  Do scientists think in terms of grammars?
    (Besides linguists, of course)

Lagramge Discussion (3)

Remember the BACON 5 heuristic:
  If we have structural knowledge that suggests a symmetric equation, then look for one first

Can we do that with Lagramge?
  Grammar for symmetry(?)
    A -> BCB (not quite, both B's are not distinct)
    A -> c*B + c*D (symmetric but not that powerful)
    A -> (B) (ditto)
  Preference for symmetry
    Jury-rig eqnPreferFnc() as needed
    Are you happy with that?

Inductive Process Modeling

IPM

We want it all!
  Processes
    Explicitly recognized objects (in software engineering sense)
    Distinguish between instances and classes
    Higher-level constructs with uninstantiated equations
    Simulation equations constructed from process ones
    Automatically deals with time
  Exhaustive search
    To some maximal complexity
  Better fitting function for time series data

IPM Generic Processes

Library pred_prey

generic process logistic_growth;
  variables S{species};
  parameters gr[0,3], ic[0,0.1];
  equations d[S,t,1] = gr * S * (1-ic*S);

generic process exponential_growth;
  variables S{species};
  parameters gr[0,3];
  equations d[S,t,1] = gr * S;

generic process exponential_decay;
  variables S{species};
  parameters dr[0,2];
  equations d[S,t,1] = -1 * dr * S;

generic process holling_1;
  variables S1{prey}, S2{predator};
  parameters ar[0.01,10], ef[0.001,0.8];
  equations d[S1,t,1] = -1 * ar * S1 * S2;
            d[S2,t,1] = ef * ar * S1 * S2;

IPM Quantitative Processes:

Fills in constants from equation templates:

Model PredatorPrey:
  vars: aurelia{prey}, nasutum{predator};
  observable: aurelia, nasutum;
  process aurelia_growth;
    equations d[aurelia,t,1] = 1.81 * aurelia * (1-0.0003*aurelia);
  process nasutum_decay;
    equations d[nasutum,t,1] = -1 * 1.04 * nasutum;
  process predation_holling_1;
    equations d[aurelia,t,1] = -1 * 0.03 * aurelia * nasutum;
              d[nasutum,t,1] = 0.30 * 0.03 * aurelia * nasutum;

Summing the process contributions for each variable gives the simulation equations:
  d[aurelia]/dt = 1.81 * aurelia * (1-0.0003*aurelia) - 0.03 * aurelia * nasutum
  d[nasutum]/dt = -1.04 * nasutum + 0.30 * 0.03 * aurelia * nasutum
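A sketch of how the model's simulation equations can be assembled from its processes: each process contributes a term to the derivative of every variable it mentions, and the terms are summed per variable. The dictionary-of-lambdas encoding and the sample state are assumptions for the example; the constants are the ones in the PredatorPrey model.

```python
from collections import defaultdict

# Each process maps the variables it affects to its contribution to d[var]/dt.
PROCESSES = {
    "aurelia_growth": {
        "aurelia": lambda s: 1.81 * s["aurelia"] * (1 - 0.0003 * s["aurelia"]),
    },
    "nasutum_decay": {
        "nasutum": lambda s: -1 * 1.04 * s["nasutum"],
    },
    "predation_holling_1": {
        "aurelia": lambda s: -1 * 0.03 * s["aurelia"] * s["nasutum"],
        "nasutum": lambda s: 0.30 * 0.03 * s["aurelia"] * s["nasutum"],
    },
}

def model_derivatives(processes, state):
    """Sum every process's contribution to each variable's rate of change."""
    total = defaultdict(float)
    for equations in processes.values():
        for variable, rhs in equations.items():
            total[variable] += rhs(state)
    return dict(total)

state = {"aurelia": 100.0, "nasutum": 10.0}   # hypothetical population sizes
rates = model_derivatives(PROCESSES, state)
print(rates)
```

Dropping or adding a process changes the model by editing one dictionary entry, which is the sense in which processes, not individual equations, are the unit of search.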

IPM Algorithm

1. Find all permissible instantiations of generic processes with specified variables:
  logistic_growth: S -> aurelia
  logistic_growth: S -> nasutum
  exponential_decay: S -> aurelia
  exponential_decay: S -> nasutum
  holling_1: S1 -> aurelia, S2 -> nasutum
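Step 1 amounts to type-checked binding of variables to process slots. The sketch below assumes that prey and predator are both subtypes of species (which is what the list of instantiations implies) and includes only the three generic processes that appear in that list.

```python
import itertools

# Assumed type hierarchy: child -> parent.
TYPE_PARENT = {"prey": "species", "predator": "species", "species": None}

# Generic processes as lists of (slot name, required type).
GENERIC = {
    "logistic_growth": [("S", "species")],
    "exponential_decay": [("S", "species")],
    "holling_1": [("S1", "prey"), ("S2", "predator")],
}

VARIABLES = {"aurelia": "prey", "nasutum": "predator"}

def is_a(var_type, required):
    """True if var_type equals required or inherits from it."""
    while var_type is not None:
        if var_type == required:
            return True
        var_type = TYPE_PARENT[var_type]
    return False

def instantiations(generic, variables):
    """Every way to bind distinct, type-compatible variables to each process's slots."""
    out = []
    for name, slots in generic.items():
        choices = [[v for v, t in variables.items() if is_a(t, req)]
                   for _, req in slots]
        for combo in itertools.product(*choices):
            if len(set(combo)) == len(combo):
                out.append((name, dict(zip((slot for slot, _ in slots), combo))))
    return out

bound = instantiations(GENERIC, VARIABLES)
for name, binding in bound:
    print(name, binding)
```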

IPM Algorithm (2)

2. Go from partially instantiated processes to generic models by carrying out an exhaustive search of model structures, up to some complexity limit

process logistic_growth;
  parameters gr[0,3], ic[0,0.1];
  equations d[aurelia,t,1] = gr * aurelia * (1-ic*aurelia);

process exponential_decay;
  parameters dr[0,2];
  equations d[nasutum,t,1] = -1 * dr * nasutum;

process holling_1;
  parameters ar[0.01,10], ef[0.001,0.8];
  equations d[aurelia,t,1] = -1 * ar * aurelia * nasutum;
            d[nasutum,t,1] = ef * ar * aurelia * nasutum;
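One simple reading of "exhaustive search up to some complexity limit" is to enumerate every subset of instantiated processes up to a maximum size. That reading, and the use of process count as the complexity measure, are assumptions of this sketch; the instance names are shorthand for the five instantiations found in step 1.

```python
from itertools import combinations

INSTANCES = [
    "logistic_growth(aurelia)",
    "logistic_growth(nasutum)",
    "exponential_decay(aurelia)",
    "exponential_decay(nasutum)",
    "holling_1(aurelia,nasutum)",
]

def model_structures(instances, max_processes):
    """Yield every non-empty combination of at most max_processes instances."""
    for k in range(1, max_processes + 1):
        yield from combinations(instances, k)

structures = list(model_structures(INSTANCES, 3))
print(len(structures))
```

Each structure still has free parameters (gr, ic, dr, ar, ef), so it is a generic model; the model shown above corresponds to the combination of `logistic_growth(aurelia)`, `exponential_decay(nasutum)` and `holling_1(aurelia,nasutum)`.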

IPM Algorithm (3)

3. Find best-fitting parameters with least squares fitting that does 2nd-order gradient descent through parameter space
  Full simulation
  Applies standard tricks for avoiding local minima

/or/ when all variables are observable
  Does teacher forcing
  No full simulation
  Given attributes at time t, minimize error at t+1
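The difference between the two error measures can be shown on a one-variable model. This sketch assumes an exponential decay process integrated with forward Euler and synthetic observations; it only scores a given parameter value and leaves out the gradient descent itself.

```python
def step(x, rate, dt):
    """One Euler step of dx/dt = rate * x."""
    return x + dt * rate * x

def full_simulation_error(observed, rate, dt):
    """Simulate from the first observation only; errors can compound over time."""
    x = observed[0]
    err = 0.0
    for target in observed[1:]:
        x = step(x, rate, dt)
        err += (target - x) ** 2
    return err

def teacher_forcing_error(observed, rate, dt):
    """Predict each t+1 from the observed value at t; needs every variable observed."""
    return sum((nxt - step(cur, rate, dt)) ** 2
               for cur, nxt in zip(observed, observed[1:]))

# Synthetic observations generated with the "true" rate -0.5.
dt = 0.1
observed = [1.0]
for _ in range(20):
    observed.append(step(observed[-1], -0.5, dt))

for rate in (-0.5, -0.4):
    print(rate,
          full_simulation_error(observed, rate, dt),
          teacher_forcing_error(observed, rate, dt))
```

With the wrong rate, the full-simulation error is the larger of the two because each step starts from an already drifted state, whereas teacher forcing resets to the data at every step.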

IPM Discussion

Finds sophisticated predator/prey relationships in synthetic and actual data
  Multiple predator and prey species

Successes
  Applies domain knowledge in scientist-friendly fashion
  Groups equations and equation fragments together in clumps that a scientist would find intuitive

Improvement?
  It's an algorithm, not an architecture
  Algorithm could be incorporated in a larger architecture
