CSC 599: Computational Scientific Discovery
-
Upload
jayme-holloway -
Category
Documents
-
view
35 -
download
1
description
Transcript of CSC 599: Computational Scientific Discovery
CSC 599: Computational Scientific Discovery
Lecture 8: Lagramge (sp) and Inductive
Process Modeling
Outline
Review Processes BACON Other equation finders The act of equation finding
Lagramge (sp)
Inductive Process Modeling
Review: Processes
Deal with changes over time Range
(start, finish), (start, duration) Rates of change
dx/dt Previous history
attribute[event[t]] = f(event[1], event[2], . . . event[t-1])
Attributes of processes (when time is included) Quantity of “always on” forces as function of time
Gravity, electro-magnetism Maximum limit of “homeostatic” forces
Friction, Normal force Misc. changes during process
Review: BACON
BACON was driven by the data and domain knowledge: Distinguishing between independent and
dependent attribute BACON 3
Discovery of intrinsic properties of conceptual values
BACON 4 Preference for symmetric equations given
knowledge that suggests one could exist BACON 5
Equation Finders, 1990s
[From Ljupco Todorovski's Homepage:] http://www-ai.ijs.si/~ljupco/ COPER [Kokar 1986]
Uses information about the dimension units of the system variables to restrict the space of possible equation structures.
BACON [Langley 1987] Pioneer among equation discovery systems. It uses a
set of data-driven heuristics for finding regularities (constancies and trends) in data and for formulating hypotheses based on them.
Equation Finders, 1990s (2)
Fahrenheit/EF[Langley and Zytkow 1989], [Zembowitz and Zytkow 1992]
Used as a equation discovery subsystem of the scientific discovery system. For discovering bivariate equations only, user being able to specify the set of operators and functions to be used within equations.
ABACUS [Falkenhainer and Michalski 1990] Experiments with different search strategies through
the space of equation structures. Also allows discovery of piecewise equations using clustering for identifying the limits between pieces.
IDS [Nordhausen and Langley 1990] ARC [Moulet 1992]
Equation Finders, 1990s (3)
E* [Schaffer 1993] Discovers of bivariate equations using a small set of
predefined equation structures. LAGRANGE [Dzeroski and Todorovski 1995]
Handled differential equations. GOLDHORN [Krizman et al. 1995]
Extended LAGRANGE towards discovery from noisy data.
SDS [Washio and Motoda 1997] Used information about scale types of the dimension
units of the system variables to restrict the space of possible equations.
LAGRAMGE [Todorovski and Dzeroski 1997] Allowed the user to specify the space of possible
equations with context free grammar.
Issues with equation finders
1. How to incorporate timeEsp. derivatives
2.How separate Heuristics used to search eqn space Grading fnc used to determine how well eqn fits:
Compare: True function = x2
Current function = -x2
Large “error” in numeric space Small “error” in symbolic (eqn space)
3. How best to incorporate domain knowledge? BACON programs did so, but was ad hoc
Lagrange:
The Person Italian/French Mathematician and physicist 1736-1813
The System Mid-1990s equation finder Found differential equations Dzeroski and Todorovski 1995
Lagramge (sp)
Lagramge = Lagrange (system) + grammar
Handling the issues:1. How to account for time?
Each vector of variable values can represent the state of the system at a particular time
Throw both attr and d[attr]/dt in variable set2. Separating
a) Heuristics for searching equation spaceExplicit user-defined function for judging fnc similiarity
b) Gauging equation fit Sum of squared errors
Lagramge, top algorithm
void lagramge(Grammar g, int maxComplexity, int maxBeamWidth, fnc fittingFnc(), fnc eqnPreferFnc(), fnc stoppingCriteriaMet()){T0 = init symbolQ1 = {T0}do { Q = Q1 R = Q.generateRefinements(); foreach r in R { r.calcBestFitConstants(fittingFnc); r.symbolicCloseness = eqnPreferFnc.calc(r); } Q1 = union(Q,R); Q1.keepBest(maxBeamWidth,symbolicCloseness); }while ( (Q != Q1) && !stoppingCriteriaMet() );}
Heuristic functions:
fnc fittingFnc()Job = given this equation parse tree, find best fitting
parametersattr
learn = c
1*attr
1 + c
2*attr
2 + c
3* attr
1*attr
2 + c
4
Distinguish between: space of equation parse trees (countable) space of all the floating point parameters (in principle,
uncountable)
fnc eqnPreferFnc()Job = given this instantiated equation, how well does
it fit the data?Traditional least squares, or some variant
Noise tolerant
Lagramge Handling Issues (2)3. Incorporation of domain knowledge
Use a grammar!Inputs:
1. Context free grammar(More on next slide)
2. Input data D = (V,vd,M)
V = set of variablesv
d = variable to predict in V
M = measurements of all vars:time v
1v
2. . . v
n
t0
v1,0
v2,0
vn,0
t1
v1,1
v2,1
vn,1
. . .
Outputs:Ordinary or differential equation
Grammar
Context Free Grammar =<nonTerminalSet, terminalSet, productionSet, startSym>
Ex:double monod(double c, double v) { return(v/(v+c)); }N = {E, F, M, v}T = {+, const, *, monod, (“,”), N, P, Z}P = { E -> const | const*F | E + const*F F -> v | M | v*M M -> monod(const,v) }S = E
Tree's “Height” as its complexity
The deeper the tree, the more complex the expression
y = f1(x), y = f1(f2(x)), y = f1(f2(f3(x)))
Sym Height = shallowest height of deepest prod.p = A -> A
1,A
2, . . .A
l
h(p) = 1 + maxi {h(A
i)}
h(A) = if (isNonTerminal(A)) minq in prods of A
{h(q)} else /* isTerminal(A) */ 0
height, production1 E->const; v->N; v->P; V->Z2 M->monod(const,v); F->v3 F->M; F->v*M; E->const*F; E->E+const*F
Lagramge Refinement
void refinement (Tree t, int maxHeight){choose A, nonterminal in Tp
A,i = production applied at that node
l = pathLength from root T to Adelete subtrees of Aif ( (i+1) <= A.numProductionsFor()) && (l + height(p
A,i+1) <= maxHeight) ) {
pA,i+1
= A->A1,A
2, . . . A
l
replace pA,i
with pA,i+1
expand A with successors A1,A
2, . . . A
l
do { choose nonTerm leaf B if T p
B,1 = B->B
1,B
2, . . . B
m
expand B with successors B1,B
2, . . . B
m
} while ( T.hasNonTerminals() );}
Refinement Example
Lagramge Findings (1):
Aquatic Ecosystems: N = concentration of nutrient P = concentration of phytoplankton Z = concentration of zooplankton
dN/dt = -NP/(kN + N)
dP/dt = NP/(kN + N) – r
PP – PZ(k
P+P)
dZ/dt = PZ(kP+P) - rZZ
Lagramge Findings (2):
Two poles on a cart (inverted pendulum):
Lagramge Discussion
Problems it solves Handling noisy data
sum of squared error Handling of time and derivatives
Derivatives are just another variable Nothing special to do on user's part
Distinguishes between error in symbol space and error in numeric value space
Heuristic fnc for 1st, grading fnc for 2nd
Incorporation of domain knowledge Symbol space heuristic fnc Grammar
Lagramge Discussion (2)
Hold up one second! Do you believe this?Incorporation of domain knowledge
Grammar
Grammer ?!? Computer scientists think in terms of grammars
FSM/regular expression PDA/context free Turing Machines/context sensitive
Do scientists think in terms of grammars?(Besides linguists, of course)
Lagramge Discussion (3)
Remember BACON 5 heuristic: If have structural knowledge that suggests a
symmetric equation than look for one first Can we do that with Lagramge?
Grammar for symmetry(?)A -> BCB (not quite, both B's are no distinct)A -> c*B + c*D (symmetric but not that powerful)A -> (B) (ditto)
Preference for symmetry Jury-rig eqnPreferFnc() as needed Are you happy with that?
Inductive Process Modeling
IPM
We want it all! Processes
Explicitly recognized objects (in software engineering sense)
Distinguish between instances and classes higher level constructs with uninstantiated equations Simulation equations constructed from process ones Automatically deals with time
Exhaustive search To some maximal complexity Better fitting function for time series data
IPM Generic ProcessesLibrary pred_preygeneric process logistic_growth;
variables S[species];parameters gr[0,3], ic[0,0,1];equations d[S,t,1] = gr * S * (1-ic*S)
generic process exponential_growth;variables S{species};parameters gr[0,3];equations d[S,t,1] = gr * S;
generic process exponential_decay;variables S{species};parameters dr[0,2];equations d[S,t,1] = -1*dr * S;
generic process holling_1;variables S1{prey}, S2{predator};parameters ar[0.01,10],
ef[0.001,0.8];equations d[S1,t,1] = -1 * ar * S1 *
S2;d[S2,t,1] = ef * ar * S1 * S2
IPM Quantitative Processes:
Model PredatorPrey:vars: aurelia{prey}, nasutum{predator};observable: aurelia, nasutum;process aurelia_growth;
equations d[aurelia,t,1] = 1.81 * aurelia * (1-0.0003*aurelia);process nasutum_decay
equations d[nasutum,t,1] = -1 * 1.04 * nasutum;process predation_holling_t:
equations d[aurelia,t,1] = -1 * 0.03 * aurelia * nasutum; d[nasutum,t,1] = 0.30 * 0.03 * aurelia * nasutum;
d[aurelia]/dt = 1.81 * aurelia * (1-0.0003*aurelia) – 0.03 * aurelia * nasutumd[nasutum]/dt = -1.04 * nasutum + 0.03 * aurelia * nasutum
Fills in constants from equation templates:
IPM Algorithm
1. Find all permissible instantiations of generic processes with specified variables:logistic_growth:S -> aurelialogistic_growth:S -> nasutumexponential_decay: S -> aureliaexponential_decay: S -> nasutumholling_1: S1 -> aurelia, S2 -> nasutum
IPM Algorithm (2)
2. Go from partially instantiated processes to generic models by carrying out exhaustive search of model structures, up to some complexity limitprocess logistic_growth;
parameters gr[0,3], ic[0,0.1];equations d[aurelia,t,1] = gr * aurelia * (1-ic*aurelia)
process exponential_decay;parameters dr[0,2];equations d[nasutum,t,1] = -1 * dr * nasutum
process holling_1:parameters ar[0.01, 10], ef[0.001, 0.8]equations d[aurelia,t,1] = -1 * ar * aurelia * nasutum d[nasutum,t,1] = ef * ar * aurelia * nasutum
IPM Algorithm (3)
3. Find best fitting parameters w/Least squares fitting that does 2nd order gradient descent thru parameter space Full simulation Applies standard tricks for avoiding local mins
/or/ when all variables observable Does teacher forcing No full simulation Given attributes at time t, minimize error at t+1
IPM Discussion
Finds sophisticated predator/prey relationships in synthetic and actual data Multiple predators and prey species
Successes Applies domain knowledge in scientist-friendly
fashion Groups equations and equation fragments together in
clumps that a scientist would find intuitive
Improvement? It's an algorithm, not an architecture Algorithm could be incorporated in a larger arch.