The Role of Artificial Neural Networks in … Role of Artificial Neural Networks in Understanding...
Transcript of The Role of Artificial Neural Networks in … Role of Artificial Neural Networks in Understanding...
The Role of Artificial Neural Networks in Understanding
Complex Systems Behavior
Gary R. WeckmanOhio University
Palm Island Enviro-Informatics & Business Solutions, LLC
David F. MilliePalm Island Enviro-Informatics & Business Solutions, LLC
Loyola University, New Orleans
Andrew P. Snow
Ohio University
School of Information and Telecommunication Systems
1
What are complex systems?
• "A system comprised of a (usually large) number of (usually
strongly) interacting entities, processes, or agents, the
understanding of which requires the development, or the use of,
new scientific tools, nonlinear models, out-of equilibrium
descriptions and computer simulations." [Advances in Complex
Systems Journal]
• "A system that can be analyzed into many components having
relatively many relations among them, so that the behavior of
each component depends on the behavior of others. [Herbert
Simon]“
• "A system that involves numerous interacting agents whose
aggregate behaviors are to be understood. Such aggregate
activity is nonlinear, hence it cannot simply be derived from
summation of individual components behavior." [Jerome Singer]2
Our Research: Complex Systems
3
‘Learning’ (analogous to problem solving) is:
adaptive - knowledge is altered, updated, & stored (via weights)
iterative - examples to generalizations
‘Universal approximators’ – can discover & reproduce any
(linear / non-linear) trend given enough data & computational
(processing) capability
No expert knowledge required
Few (if any)‘formal’ assumptions - i.e. Gaussian requirements, etc.
Artificial Neural NetworksMachine-learning algorithms that identify data patterns and perform
decision making in a manner imitating cognitive functionality
Disadvantage - (superficially ? ?) lack a declarative knowledge
structure
a ‘Black Box’ (i.e. no global equation)4
Biological Analogy
• Brain Neuron
• Art if icial neuron
• Set of processing
elements (PEs) and
connect ions (weights)
with adjustable
strengthsX4
X3
X5
X1
X2
Output
Layer
Input
Layer
Hidden Layer
f(net)In
pu
ts
w1
w2
wn
5
Modeling Approach
Model Strategy
?
Data Collection
Attribute
Behavior
Accuracy
Occam’s Razor
6
Early Days: Interested in “ Model Accuracy”
7
Modeling Approach
Early Project: Stock Market Model
Closing Prices Of Stock
0
1
2
3
4
5
6
1 15 29 43 57 71 85 99 113 127 141 155 169 183 197 211 225 239
Accuracy of predict ing market turns – not necessarily
why
8
‘Paradigms’ of Scientific Discovery *
Empirical - describing natural phenomena
initiated, a thousand years ago
Theoretical - models, ‘laws’ & generalizations
✴ initiated, the last few hundred years
Computational - simulating complex phenomena
initiated, the last few decades
9
ANN: BLACK BOX
10
KNOWLEDGE EXTRACTION defined:
is the creation of knowledge from
structured (relational databases, XML)
and unstructured (text, documents,
images) sources
[https://en.wikipedia.org/wiki/]
Is there a way illuminate the black box?
Environmental Modeling & Knowledge Extraction
Pamlico
Sound
Croatan
National
Forest
Neuse River & Estuary
Variable Behavior
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
AP
I
S(%
)
tota
l nitro
gen
TA
N (
tota
l
acid
num
ber)
Satu
rate
s
Aro
matics
resin
s
asphaltenes V Ni
%C
rude o
il
Se
ns
itiv
ity
%Inhibition
Network Output(s) for Varied Input Saturates
0.9585
0.959
0.9595
0.96
0.9605
0.961
0.9615
0.962
0.9625
0.963
26.7
61
30.0
33
33.3
05
36.5
77
39.8
49
43.1
21
46.3
93
49.6
65
52.9
37
56.2
09
59.4
81
62.7
53
Varied Input Saturates
Ou
tpu
t(s)
%Inhibition
1st ATTEMPT:
• Included all attributes collected
• Sensitivity about the means
• Found many limitations to
current method
How are we to explain a more
complex situation?
11
Lake
Huron
Lake Ontario
Saginaw
Bay
Lake Huron
❖ 2nd largest GL by area
- 59,600 km2
❖ 3rd largest GL by
volume - 3,540 km3
❖ 6,157 km coastline &
134,100 km2 drainage
❖Max depth 229 m
(mean 59 m)
❖ 22 yr retention
Lake Erie
4th largest GL by area -
25,700 km2
5th largest GL by
volume - 484 km3
1402 km coastline &
7,800 km2 drainage
Max depth 282 m
(mean 85 m)
2.6 yr retention
Western
Lake Erie
12
Bathymetric contours
in 5-m intervalsSaginaw River
Outer
Bay
Lake
Huron
Inner
Bay
Inner OuterMax / MeanDepth (m) 14.0 / 5.09 40.5 / 13.66
SurfaceArea (km2) 1,554 1,217
Volume (km3) 7.91 16.63
Retentiontime (d) 58 - River mouth to
Lake proper
100 20
km Lower
Lower
Upper
Upper
Middle
5
0
km
0 10
0
Zebra Mussels & Water Quality Assessment (1990 - 1996)
1990-1996
Dreissena polymorpha
Sampling Sites
Oceans & Human Health Initiative (2003 - 2005)
2003-2005
Microcystis aeruginosa Multiple Stressor Program (2008 - 2010)
2008-2010
13
Predicting Saginaw Bay Chl a (1991-1996)
MLP - 1 Hidden Layer of 4 Processing Elements
Measured Chlorophyll a (μg L-1)
0 9 18 27 36 45
Test (n = 244)
r = 0.89 (p < 0.0001)
MSE / NMSE = 8.08 / 0.25
1:1
Mod
eled
Ch
loro
ph
yll
a(μ
g L
-1)
0
9
18
27
36
45Training (n = 586)
r = 0.79 (p < 0.0001)
MSE / NMSE = 10.69 / 0.39
Cross Validation (n = 146)
r = 0.85 (p < 0.0001)
MSE / NMSE = 4.89 / 0.27
0 9 18 27 36 45
1:1
0
9
18
27
36
45
Hydrological Predictors: oC, Sechhi, Kd, Cl, NO3, NH4, SRP, TP, SiO2, PSiO2, DOC,
POC
14
Existing Knowledge Extraction Tools
Neural Interpretation Diagram
• Decomposition method to visual
– Determine significance of input variables
– Based on the magnitude of interconnecting weights
Connected Weights
• Decomposition method that uses weights of an ANN to determine:
– Input Significance to model
– Nodes Significance to ANN
• Procedure– Calculate “connected weights” for all possible paths of the network
15
* Line thickness portrays the relative magnitude of the weight
positive (excitatory) weights
negative (inhibitory) weights
Chl a
oC
Secchi
Kd
TP
SRP
NO3
NH4
SiO2
PSiO2
CL
POC
DOC
Network Interpretive Diagram*
(of a trained network)
Garson’s Algoritthm
Relative Share of Prediction
oC
9% Secchi
6%Kd
6%
TP
12%
SRP
7%NO3
7%NH4
5%
SiO2
9%
PSiO2
13%
CL
6%
POC
11%
DOC
9%
Single Parameter Sensitivity
Analysis (± 1 SD)
0.0
0.7
1.4
2.1
2.8
3.5
16
Saginaw Bay CHL a (2008-2010) - Hydrological & Meteorological Predictors
Line thickness portrays the relative magnitude of the weight
positive (excitatory) weights
negative (inhibitory) weights
Developed More Complex Networks
17
Multi-Variable Sensitivity Analysis (circa 2006 !)
Developed New Approaches to Observe Interactions
18
Decision Trees• Symbolic Knowledge Extraction
Technique
• Most commonly used decision
tree induction algorithm – C4.5
(Quinlan)
• Recursive partitioning of the
data
• Drawback: Amount of data
reaching each node decreases
with the depth of the tree
• Alternative: TREPAN
19
TREPAN+ Methodologies
20
Simulation Based Neural Network Modeling
Simulation Block Diagram
Survivability
Availability
Number of failed
components (6)
Number of FCC-
Reportable outages
Component Failure
Timeline
MTTF (6)
MTR (6)
Time Profile
Run Time
Standard Deviation for
MTR (6)
0
5
10
15
20
25
30
35
40
45
50
0 1 2 3 4 5 6
Varied Input MTTFBSCBS
# o
f R
ep
ort
ab
le O
uta
ges
Simulation Output
Neural Netw ork Output
Input &
Output
Behavior
Scenario
Testing
Investigate training a NN network with results from wireless simulation
21
MTTFBSCBS <=3.1083
MTRBSCBS <=3.8833
MTRBSCBS <=4.7667
No Yes
MTTFBSCBS <=1.4417
YesNo
Good(10-20)
BAD(30-40)
OKAY(20-30)
YesNo
MTTFBSCBS <=1.8833
YesNo
MTTFBS<=2.3500
MTTFBS <=1.8083
OKAY(20-30)
YesNo
YesNo
No Yes
MTTFBSCBS<=2.3750
No Yes
OKAY(20-30)
Good(10-20)
Good(10-20)
VeryGood(<10)
Good(10-20)
0
20000
40000
60000
80000
100000
120000
140000
160000
180000
Sen
siti
vit
y
MTTFB
S
MTTFB
SC
MTTFB
SCBS
MTTFM
SC
MTTFM
SCBSC
MTTFD
B
MTR
BS
MTR
BSC
MTR
BSC
BS
MTR
MSC
MTR
MSC
BSC
MTR
DB
Input Name
Sensitivity About the Mean
MTTFBS
MTTFBSC
MTTFBSCBS
MTTFMSC
MTTFMSCBSC
MTTFDB
MTRBS
MTRBSC
MTRBSCBS
MTRMSC
MTRMSCBSC
MTRDB
Knowledge Extraction for Wi-Fi
Utilized Techniques:
• TREPAN+
• Sensitivity
22
Needed More Understanding: Variable Interactions
23
Multiple Variable Interactions while looking at various states!
Our drive to Mechanistic Model: Grey Box => WHITE BOX
Different Project: Crude Oil Impact
• Used New Set of Tools:
Limitations to Sensitivity:
• 2 ANNs were created for “high” and “low” %Crude Oils
• Sensitive results were very different
•%Crude Oil <=20% Crude Oil >=50%
24
Upper Mid Lower Upper Lower0.0
3.5
7.0
10.5
14.0
Inner Bay Outer Bay
a
a
a
b b
Inner vs Outer Bay:
Mann-Whitney U Statistic = 559
T = 1024; p = < 0.001
Means with the same letter are
not different (Dunn’s Method);
p > 0.05
Mean [Chlorophyll a]
Saginaw
River
Lake
Huron
Bay Reach:
Kruskal Wallis ANOVA on Ranks = 559
H = 50.65, df = 4; p = < 0.001
Ch
loro
ph
yll
a (
ug
L-1
)
Revised Look: Saginaw Bay 2008 - 2010
Outer BayInner Bay
TM
P, A
TM
P, W
nd
DirPC
2 –
17
.1 %
PC 1 – 22.3 %
TP, TSS, PON, POC SECCHI
-10 -5 0 5
-5
0
5
10
TD
P, N
O3,
NH
4
PCA – Hydrological / Meteorological VariablesMonthly Means
Distinct difference in Inner and Outer Bays25
1:1Training; n = 151
r = 0.95, NMSE = 0.10
Cross-Validation; n = 50
r = 0.88, NMSE = 0.22
0
15
30
45
60
Measured CHL a (g L-1)
0 15 30 45 60
0
15
30
45
60
Mod
eled
CH
L a
(
g L
-1)
1:1Test; n = 49
r = 0.98, NMSE = 0.05
0.00
1.25
2.50
3.75
5.00+ 1 SD, Common Variation
+ 2 SD, Disturbance Variation
S
D C
HL
a
Local Sensitivity
Distinct differences based on
range of Sensitivity
26
CHL as a function of TP (across TEMP slices)
Total Phosphorus (g L-1)23018446 13892
0
125
100
25
75
50
Mod
eled
Ch
loro
ph
yll
a(
gL
-1)
0
0
125
100
25
75
50
12
3228
16
24
20
230
0
138
4692
184
CHL as a function of TP & TEMP
CHL a = 95.09 – (0.30*TP) – (9.96*TEMP) +
(4.49e-4*TP2) + (0.25*TEMP2) +
(0.02*TP*TEMP) adj r2 = 0.98, Fit SE = 4.09, Fstat = 2478.82
Introduced New Visualizations: Multi-variable Impact on Chlorophyll a
Mod
eled
Ch
loro
ph
yll
a(
gL
-1)
CHL a0.5 = 1.98 + (0.03*TP)adj r2 = 0.99, Fit SE = 0.41, Fstat = 29857.36
ln CHL a = 2.23 + (0.002*TEMP2)adj r2 = 0.99, Fit SE = 1.03, Fstat = 6323.88
CHL a = -862.16 + (473.88*WndSpdAve-3) - (103.65*WndSpdAve-32)
+ (12.14*WndSpdAve-33) - (0.82*WndSpdAve-3
4) + (0.03*WndSpdAve-
35) - (0.001*WndSpdAve-3
6) + (5.80e-6*WndSpdAve-37)
adj r2 = 0.99, Fit SE = 0.13, Fstat = 13,127.67
0
90
18
54
36
72
TP50 = 51.8 g L-1
TEMP50 = 22.6 oC
WndSpd-3-50 = 18.0 km hr-1
Half-Maximal Abundance
Concentrations / Conditions:
27
[CHL a] =
6.08 + (-0.77*[TP]) - (52.22/[NH4-Ν]) +
(0.04*[TP]2) + (18.04/[NH4-Ν]2) +
(7.68*[TP]/[NH4-Ν]) - (0.001*[TP]3) +
(35.4/[NH4-Ν]3) - (8.62*TP/[NH4-Ν]2) -
(0.06*[TP]2/[NH4-Ν])
[CHL a] =
1 / -4.15 + (-0.03*ln[TP]) +
(5.14/[TP]1.5) + (1.94*[NO3-Ν]) -
(11.90*Exp [NO3-Ν]/-0.8267) -
(1.7E-05/[NO3-Ν]) + (16.16*Exp[NO3-Ν])
CHL a as a function of TP & NO3-N CHL a as a function of TP & NH4-N
Delineating TP Thresholds for Saginaw Bay CHL a (2008-2010)(Taking Into Account the Interactions and/or Synergisms of Co-Limiting Nutrients)
25 µg TP L-1 25 µg TP L-1
TP (μg L-1)
NO
3-N
(m
g L
-1)
NΗ
4-N
(μ
g L
-1)
TP (μg L-1) 28
29
Generalized Equation
for 2 variable interaction
with output (CHL a)
Development of Grey Box Technique
Iterations : ANNs Models
Multiple ANN models
utilizing 2 variables at a
time to predict Output
Iterations: Additive Models Finalized Combined Model
30
Global Sensitivity
• Sensitivity about Means
– Local Sensitivity
– Does not consider variable interactions as states change
• Developed Global Sensitivity
– Looks at how variables interact as their states change!
31
32
Each Variable has its own
distribution of values (States)
Impact of Correlation on State
Behavior
PON Secchi TSS TP TDP SRP NH4 NO3 CL Sol_Si POC DOC
-1.25 σ 1.57 -0.70 -0.98 -0.48 -0.25 0.02 -0.02 -0.57 -0.16 -1.16 -0.80
-0.75 σ 0.53 -0.67 -0.59 -0.04 -0.14 0.09 0.41 -0.02 -0.40 -0.79 0.04
-0.25 σ -0.17 -0.08 -0.16 -0.11 -0.09 -0.09 -0.04 -0.14 -0.09 -0.26 -0.14
0.25 σ -0.40 -0.02 0.14 0.13 0.04 -0.26 -0.24 -0.16 0.39 0.35 -0.06
0.75 σ -0.68 0.50 0.31 -0.37 -0.06 -0.49 -0.35 -0.06 0.20 0.87 0.15
1.25 σ -0.75 0.64 1.58 0.97 0.72 -0.08 -0.64 0.89 0.31 1.42 0.42
1.75 σ -0.85 1.35 0.95 0.05 0.45 -0.45 -0.55 0.95 0.25 2.00 0.95
Global Sensitivity
Global Variation Across States
Significant difference in Global versus Local Sensitivity
33
Global (State Based) versus Local (Means) Sensitivity
34
Lake Erie Microcystis (Continuous MLP); Hydrological & Meteorological
HLs: 32-15-14-10-1, TanH/Mom
Training & Cross-Validation
Mod
eled
Mic
rocy
stis
(Log
10
m3
L-1
)
Measured Microcystis (Log10 m3 L-1)
4
12
10
6
8
4
12
10
6
8
Test Application
4 12106 8
State-Based Global Sensitivity;
Min-Max (Optimal 12 Predictors)
Mod
eled
Mic
rocy
stis
(Log
10
m3
L-1
)
35
Ecology & ‘Big’ Data:
Not all ‘Big Data’ created equally:
volume, variety, velocity, volatility, veracity
No longer ‘… your daddy’s database …’
‘Big’ Data = ‘Big’ Information = ‘Big’ Value
Does ‘Big’ Data ensure ‘Big’ Science
?
Data Issues
• Big: Random reduction
• Little: Synthetic (SMOTE)
• Imbalance Data
• 0’s
36
Imbalanced Datasets
• Definition: under or over representation of a class in a dataset is considered as an imbalance in a dataset.
• Ill-balanced, unbalanced, uneven
Balanced DatasetImbalanced Dataset
Balanced DatasetImbalanced Dataset 37
Graphic showing change under/over Sampling
Under Sampling Over Sampling
38
SMOTE’s Informed Oversampling Procedure
39
Smote: Synthetic
Minority Over-
sampling Technique
Lake Erie (2009-2011) Chlorophyll a & Microcystis Distributions
World Health Organization Guidance Values for Acute Health Effects of
Cyanobacteria-Dominated Waters *
* after Chorus & Bartram 1999
‘Moderate’ Risk(> 10-50 g L-1)
‘Low’Risk(< 10 g L-1)
‘High’ Risk(> 50 g L-1)
n = 62
PresenceAbsence
Continuous
A ‘two-step’ approach
(akin to GLiMMs)
‘High’ Risk(> 108 cells L-1)
‘Moderate’Risk(> 2 x 107 cells L-1)
‘Very High’
Risk(> 1010 cells L-1)
‘Low’ Risk(< 2 x 107 cells L-1)
40
Training &
Cross Validation (class imbalance
corrected via
SMOTE)
Absent Present
Absent130 10
Present8 151
Total 299
Test
ApplicationAbsent Present
Absent 10 7
Present 4 65
Total 86
Accuracy (% correct) - 93.98
% Absent Correct – 94.20
% Present Correct – 93.79
Accuracy (% correct) - 89.0
% Absent Correct – 71.43
% Present Correct – 90.23
Lake Erie Microcystis (Presence-Absence MLP); Hydrological & Meteorological
HLs: 29-15-10-5-1
0
1.0
0.8
0.2
0.6
0.4
Mic
rocy
sis
Occ
urr
en
ce P
rob
ab
ilit
yB)
Presence
Absence
TPP-A = 43 g L-1
TEMPP-A = 21 oC
WndSpdP-A = 19 km hr-1
Concentrations / Conditions for
Occurrence Likelihood of Microcystis:
41
Visualizing Predictive Variances & Uncertainties for Microcystis (Continuous)
5.5
11.0
9.9
6.6
8.8
7.7
10.0
32.0
27.6
14.4
23.218.8
225
0
135
4590
180
Mod
eled
Mic
rocy
stis
(Log
10
m3
L-1
)
5.5
11.0
9.9
6.6
8.8
7.7
Total Phosphorus (g L-1)22513545 90 1800
Microcystis as a function of TP & TEMP
Microcystis as a function of TP (by TEMP slices)
TP50 = 50.4 g L-1
TEMP50 = 26.3 oC
WndSpd(-3D)50 = 23.8 km hr-1
Mod
eled
Mic
rocy
stis
(Log
10
m3
L-1
)
5.5
11.0
9.9
8.8
6.6
7.7
Half-Maximal Abundance
Concentrations / Conditions:
42
So …. Why Care About This Data ‘Stuff’ ?
‘Paerl-o-gram’ courtesy of
Hans Paerl, Univ. North Carolina - Chapel Hill
This is were we are TODAY!
43
Machine-
learning
analytics
Machine-learning algorithms capable of autonomously unearthing and
reproducing complex patterns within sizeable data quantities afford great
potential for fueling ecological hypothesis creation and ‘intelligent’
knowledge derivation (here, ‘Robo-ecology’).
Still more effort to develop and investigate new ideas
44