The Role of Artificial Neural Networks in Understanding Complex Systems Behavior

Gary R. Weckman
Ohio University; Palm Island Enviro-Informatics & Business Solutions, LLC

David F. Millie
Palm Island Enviro-Informatics & Business Solutions, LLC; Loyola University, New Orleans

Andrew P. Snow
Ohio University, School of Information and Telecommunication Systems

What are complex systems?

• "A system comprised of a (usually large) number of (usually strongly) interacting entities, processes, or agents, the understanding of which requires the development, or the use of, new scientific tools, nonlinear models, out-of-equilibrium descriptions and computer simulations." [Advances in Complex Systems Journal]

• "A system that can be analyzed into many components having relatively many relations among them, so that the behavior of each component depends on the behavior of others." [Herbert Simon]

• "A system that involves numerous interacting agents whose aggregate behaviors are to be understood. Such aggregate activity is nonlinear, hence it cannot simply be derived from summation of individual components' behavior." [Jerome Singer]

Our Research: Complex Systems


Artificial Neural Networks

Machine-learning algorithms that identify data patterns and perform decision making in a manner imitating cognitive functionality

‘Learning’ (analogous to problem solving) is:
• adaptive - knowledge is altered, updated, & stored (via weights)
• iterative - examples to generalizations

‘Universal approximators’ – can discover & reproduce any (linear / non-linear) trend given enough data & computational (processing) capability

No expert knowledge required

Few (if any) ‘formal’ assumptions - e.g. Gaussian requirements

Disadvantage - (superficially?) lack a declarative knowledge structure: a ‘Black Box’ (i.e. no global equation)

Biological Analogy

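• Brain neuron

• Artificial neuron

• Set of processing elements (PEs) and connections (weights) with adjustable strengths

[Figure: feed-forward network with inputs X1-X5, an input layer, a hidden layer and an output layer; each artificial neuron applies f(net) to its inputs weighted by w1, w2, ..., wn]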
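The artificial neuron described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the tanh activation, the bias term, and every weight value are assumptions made for the example.

```python
import math

def neuron_output(inputs, weights, bias=0.0, activation=math.tanh):
    """Return f(net), where net is the weighted sum of the inputs plus a bias."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: one hidden layer of tanh PEs feeding a linear output PE."""
    hidden = [neuron_output(inputs, w, b)
              for w, b in zip(hidden_weights, hidden_biases)]
    return neuron_output(hidden, output_weights, output_bias,
                         activation=lambda net: net)

# Hypothetical inputs and weights, chosen only for illustration.
x = [0.5, -1.0, 0.25]
hidden_w = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
hidden_b = [0.0, 0.1]
y = mlp_forward(x, hidden_w, hidden_b, output_weights=[1.5, -0.8], output_bias=0.05)
print(round(y, 4))
```

Training adjusts the weights iteratively so that the outputs match the examples; only the forward computation is shown here.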

Modeling Approach

[Figure: Model Strategy diagram relating Data Collection, Attribute, Behavior, Accuracy and Occam’s Razor]

Early Days: Interested in “Model Accuracy”

Modeling Approach

Early Project: Stock Market Model

[Figure: Closing Prices of Stock, a time series over roughly 240 observations]

Accuracy of predicting market turns, not necessarily why

‘Paradigms’ of Scientific Discovery *

Empirical - describing natural phenomena (initiated a thousand years ago)

Theoretical - models, ‘laws’ & generalizations (initiated in the last few hundred years)

Computational - simulating complex phenomena (initiated in the last few decades)

ANN: BLACK BOX


KNOWLEDGE EXTRACTION defined: the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources [https://en.wikipedia.org/wiki/]

Is there a way to illuminate the black box?

Environmental Modeling & Knowledge Extraction

[Map: Neuse River & Estuary, with Pamlico Sound and Croatan National Forest]

Variable Behavior

[Figure: Sensitivity of %Inhibition (0-0.8) to each input: API, S (%), total nitrogen, TAN (total acid number), saturates, aromatics, resins, asphaltenes, V, Ni, %Crude oil]

[Figure: Network Output(s) for Varied Input Saturates; x-axis: varied input saturates (26.761-62.753), y-axis: output %Inhibition (0.9585-0.963)]

1st ATTEMPT:

• Included all attributes collected

• Sensitivity about the means

• Found many limitations to current method
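A sensitivity-about-the-means analysis can be sketched as follows. This is an illustrative version under stated assumptions: each input is moved ± 1 SD around its mean while the others stay at their means, and `toy_model` is a hypothetical stand-in for a trained network.

```python
import statistics

def sensitivity_about_mean(model, data, n_sd=1.0):
    """For each input, hold the others at their means, move that input by
    +/- n_sd standard deviations, and return the absolute output change."""
    columns = list(zip(*data))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]
    result = []
    for i, (m, s) in enumerate(zip(means, sds)):
        low, high = list(means), list(means)
        low[i] = m - n_sd * s
        high[i] = m + n_sd * s
        result.append(abs(model(high) - model(low)))
    return result

def toy_model(x):
    # Hypothetical response with an interaction between inputs 0 and 2.
    return 2.0 * x[0] + 0.5 * x[1] + x[0] * x[2]

data = [[1.0, 10.0, -1.0], [2.0, 20.0, 0.0], [3.0, 30.0, 1.0]]
print(sensitivity_about_mean(toy_model, data))  # [4.0, 10.0, 4.0]
```

Because the other inputs are pinned at their means, the interaction between inputs 0 and 2 is seen at only one point, which is one of the limitations of the method.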

How are we to explain a more complex situation?

[Map: Lake Huron, Saginaw Bay and Lake Ontario]

Lake Huron

❖ 2nd largest GL by area - 59,600 km²
❖ 3rd largest GL by volume - 3,540 km³
❖ 6,157 km coastline & 134,100 km² drainage
❖ Max depth 229 m (mean 59 m)
❖ 22 yr retention

Lake Erie

❖ 4th largest GL by area - 25,700 km²
❖ 5th largest GL by volume - 484 km³
❖ 1,402 km coastline & 78,000 km² drainage
❖ Max depth 64 m (mean 19 m)
❖ 2.6 yr retention

[Map: Western Lake Erie]

[Map: Saginaw Bay, showing the Saginaw River, the Inner Bay and Outer Bay (upper, middle and lower reaches), Lake Huron and the sampling sites; bathymetric contours in 5-m intervals]

                        Inner           Outer
Max / Mean Depth (m)    14.0 / 5.09     40.5 / 13.66
Surface Area (km²)      1,554           1,217
Volume (km³)            7.91            16.63
Retention time (d)      58 (river mouth to lake proper)

Sampling Sites

• Zebra Mussels & Water Quality Assessment (1990 - 1996): Dreissena polymorpha
• Oceans & Human Health Initiative (2003 - 2005): Microcystis aeruginosa
• Multiple Stressor Program (2008 - 2010)

Predicting Saginaw Bay Chl a (1991-1996)

MLP - 1 Hidden Layer of 4 Processing Elements

[Figure: modeled vs. measured chlorophyll a (μg L⁻¹), 0-45 on both axes, with 1:1 line; one panel each for training, cross validation and test]

Training (n = 586): r = 0.79 (p < 0.0001); MSE / NMSE = 10.69 / 0.39

Cross Validation (n = 146): r = 0.85 (p < 0.0001); MSE / NMSE = 4.89 / 0.27

Test (n = 244): r = 0.89 (p < 0.0001); MSE / NMSE = 8.08 / 0.25

Hydrological Predictors: °C, Secchi, Kd, Cl, NO3, NH4, SRP, TP, SiO2, PSiO2, DOC, POC
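The fit statistics reported for the chlorophyll a model (r, MSE, NMSE) can be computed as in the sketch below. The NMSE definition used here, MSE divided by the variance of the measured values, is an assumption; the slides do not state which normalization was used. The sample values are invented.

```python
import math

def mse(measured, modeled):
    """Mean squared error between measured and modeled values."""
    return sum((m - p) ** 2 for m, p in zip(measured, modeled)) / len(measured)

def nmse(measured, modeled):
    """MSE normalized by the population variance of the measured values
    (assumed definition)."""
    mean = sum(measured) / len(measured)
    variance = sum((m - mean) ** 2 for m in measured) / len(measured)
    return mse(measured, modeled) / variance

def pearson_r(measured, modeled):
    """Pearson correlation coefficient between measured and modeled values."""
    n = len(measured)
    mx, my = sum(measured) / n, sum(modeled) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(measured, modeled))
    sx = math.sqrt(sum((x - mx) ** 2 for x in measured))
    sy = math.sqrt(sum((y - my) ** 2 for y in modeled))
    return cov / (sx * sy)

# Invented values, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
modeled = [3.0, 3.0, 7.0, 7.0]
print(mse(measured, modeled), nmse(measured, modeled),
      round(pearson_r(measured, modeled), 3))
```

Under this definition an NMSE of 0 is a perfect fit, and an NMSE of 1 is no better than predicting the mean of the measurements.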

Existing Knowledge Extraction Tools

Neural Interpretation Diagram

• Decomposition method to visualize the network:
– Determine significance of input variables
– Based on the magnitude of interconnecting weights

Connected Weights

• Decomposition method that uses the weights of an ANN to determine:
– Input significance to the model
– Node significance to the ANN

• Procedure: calculate “connected weights” for all possible paths of the network
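For a network with one hidden layer and one output, the connected-weights procedure can be sketched as below. This follows one common formulation (the signed product of input-to-hidden and hidden-to-output weights, summed over every path through the hidden layer); whether the authors used exactly this variant is an assumption, and the weights are hypothetical.

```python
def connection_weights(input_hidden, hidden_output):
    """input_hidden[i][j]: weight from input i to hidden PE j.
    hidden_output[j]: weight from hidden PE j to the single output.
    Returns, per input, the sum over hidden PEs of the signed weight products."""
    return [
        sum(w_ij * v_j for w_ij, v_j in zip(row, hidden_output))
        for row in input_hidden
    ]

# Hypothetical weights: 3 inputs, 2 hidden PEs, 1 output.
input_hidden = [[0.5, -1.0], [2.0, 0.5], [-0.25, 0.0]]
hidden_output = [1.0, -2.0]
print(connection_weights(input_hidden, hidden_output))  # [2.5, 1.0, -0.25]
```

Keeping the sign shows whether an input is excitatory or inhibitory overall, which magnitude-only measures cannot show.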

Network Interpretive Diagram* (of a trained network)

[Figure: network diagram with inputs °C, Secchi, Kd, TP, SRP, NO3, NH4, SiO2, PSiO2, CL, POC and DOC, and output Chl a; positive (excitatory) and negative (inhibitory) weights are drawn differently]

* Line thickness portrays the relative magnitude of the weight

Garson’s Algorithm

Relative Share of Prediction

°C       9%
Secchi   6%
Kd       6%
TP      12%
SRP      7%
NO3      7%
NH4      5%
SiO2     9%
PSiO2   13%
CL       6%
POC     11%
DOC      9%
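A relative-share calculation in the spirit of Garson's algorithm can be sketched as follows for a single-hidden-layer, single-output network. This is one commonly cited formulation, not necessarily the one used for the shares listed above; the weights are hypothetical. In this formulation the hidden-to-output weight cancels within each hidden PE, and variants that retain it exist.

```python
def garson(input_hidden, hidden_output):
    """Relative importance of each input, from absolute weight products.
    input_hidden[i][j]: weight from input i to hidden PE j.
    hidden_output[j]: weight from hidden PE j to the output."""
    n_inputs = len(input_hidden)
    shares = [0.0] * n_inputs
    for j, v_j in enumerate(hidden_output):
        contrib = [abs(input_hidden[i][j]) * abs(v_j) for i in range(n_inputs)]
        total = sum(contrib)
        if total == 0.0:
            continue  # this hidden PE carries no signal
        for i in range(n_inputs):
            shares[i] += contrib[i] / total
    grand_total = sum(shares)
    return [s / grand_total for s in shares]

# Hypothetical weights: 2 inputs, 2 hidden PEs.
input_hidden = [[1.0, -1.0], [3.0, 1.0]]
hidden_output = [2.0, -0.5]
print(garson(input_hidden, hidden_output))  # [0.375, 0.625]
```

The shares sum to 1 and discard the sign of each weight, so they rank inputs by magnitude of influence only.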

[Figure: Single Parameter Sensitivity Analysis (± 1 SD); sensitivity axis 0.0-3.5]

Saginaw Bay CHL a (2008-2010) - Hydrological & Meteorological Predictors

[Figure: network interpretive diagram; line thickness portrays the relative magnitude of the weight, with positive (excitatory) and negative (inhibitory) weights drawn differently]

Developed More Complex Networks

Multi-Variable Sensitivity Analysis (circa 2006!)

Developed New Approaches to Observe Interactions

Decision Trees

• Symbolic knowledge extraction technique

• Most commonly used decision tree induction algorithm: C4.5 (Quinlan)

• Recursive partitioning of the data

• Drawback: the amount of data reaching each node decreases with the depth of the tree

• Alternative: TREPAN
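The data-sparsity drawback and the oracle idea behind TREPAN can be illustrated with a small sketch. It shows only one ingredient, querying the trained network for labels on extra points when a node holds too little data. The Gini split, the uniform sampling and `network_oracle` are simplifying assumptions, not the TREPAN algorithm itself.

```python
import random

def network_oracle(x):
    """Stand-in for a trained ANN's class prediction (hypothetical rule)."""
    return "high" if x[0] + 0.5 * x[1] > 1.0 else "low"

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature, threshold) minimizing the weighted Gini impurity."""
    best = (None, None, float("inf"))
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best[2]:
                best = (f, t, score)
    return best[0], best[1]

def augment_with_oracle(rows, min_samples, rng, low=0.0, high=2.0):
    """Top up a sparse node with random points, all labelled by the network."""
    rows = list(rows)
    while len(rows) < min_samples:
        rows.append([rng.uniform(low, high) for _ in range(len(rows[0]))])
    return rows, [network_oracle(r) for r in rows]

rng = random.Random(0)
sparse_node = [[0.2, 0.1], [1.8, 1.5]]          # only two records reach this node
rows, labels = augment_with_oracle(sparse_node, min_samples=200, rng=rng)
feature, threshold = best_split(rows, labels)
print(len(rows), feature, round(threshold, 2))
```

Because labels come from the network rather than the original data, the tree describes what the network has learned, and a split deep in the tree can still be chosen from a full sample.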

TREPAN+ Methodologies


Simulation Based Neural Network Modeling

[Figure: Simulation Block Diagram; block labels: MTTF (6), MTR (6), Standard Deviation for MTR (6), Time Profile, Run Time, Component Failure Timeline, Survivability, Availability, Number of failed components (6), Number of FCC-Reportable outages]

[Figure: number of reportable outages (0-50) vs. varied input MTTFBSCBS (0-6), comparing Simulation Output with Neural Network Output]

Input & Output Behavior

Scenario Testing

Investigate training a neural network with results from a wireless simulation

Knowledge Extraction for Wi-Fi

Utilized Techniques:

• TREPAN+

• Sensitivity

[Figure: TREPAN+ decision tree with Yes / No branches splitting on MTTFBSCBS (<= 3.1083, <= 1.4417, <= 1.8833, <= 2.3750), MTRBSCBS (<= 3.8833, <= 4.7667) and MTTFBS (<= 2.3500, <= 1.8083); leaf classes VeryGood (<10), Good (10-20), OKAY (20-30) and BAD (30-40)]

[Figure: Sensitivity About the Mean (0-180,000) by input name: MTTFBS, MTTFBSC, MTTFBSCBS, MTTFMSC, MTTFMSCBSC, MTTFDB, MTRBS, MTRBSC, MTRBSCBS, MTRMSC, MTRMSCBSC, MTRDB]

Needed More Understanding: Variable Interactions


Multiple Variable Interactions while looking at various states!

Our drive to Mechanistic Model: Grey Box => WHITE BOX

Different Project: Crude Oil Impact

• Used New Set of Tools:

Limitations to Sensitivity:

• 2 ANNs were created for “high” and “low” %Crude Oils

• Sensitivity results were very different

[Figure: sensitivity panels for %Crude Oil <= 20 and %Crude Oil >= 50]

Revised Look: Saginaw Bay 2008 - 2010

[Figure: mean chlorophyll a (μg L⁻¹), 0-14, by bay reach from the Saginaw River to Lake Huron: Inner Bay (Upper, Mid, Lower) and Outer Bay (Upper, Lower), lettered a, a, a, b, b in that order. Means with the same letter are not different (Dunn’s Method; p > 0.05).]

Inner vs Outer Bay: Mann-Whitney U Statistic = 559; T = 1024; p < 0.001

Bay Reach: Kruskal-Wallis ANOVA on Ranks = 559; H = 50.65, df = 4; p < 0.001

[Figure: PCA of hydrological / meteorological variables (monthly means), Inner Bay vs. Outer Bay. PC 1 (22.3 %): TP, TSS, PON, POC, SECCHI. PC 2 (17.1 %): TMP, ATMP, WndDir; TDP, NO3, NH4.]

Distinct difference in Inner and Outer Bays

[Figure: modeled vs. measured CHL a (μg L⁻¹), 0-60 on both axes, with 1:1 line]

Training; n = 151: r = 0.95, NMSE = 0.10

Cross-Validation; n = 50: r = 0.88, NMSE = 0.22

Test; n = 49: r = 0.98, NMSE = 0.05

[Figure: Local Sensitivity; SD CHL a (0.00-5.00) for + 1 SD (Common Variation) and + 2 SD (Disturbance Variation)]

Distinct differences based on range of Sensitivity

Introduced New Visualizations: Multi-variable Impact on Chlorophyll a

[Figure: CHL as a function of TP (across TEMP slices); x-axis: total phosphorus (μg L⁻¹), 0-230; y-axis: modeled chlorophyll a (μg L⁻¹), 0-125]

[Figure: CHL as a function of TP & TEMP; modeled chlorophyll a (μg L⁻¹), 0-125, over total phosphorus (0-230 μg L⁻¹) and temperature (12-32 °C)]

\[ \mathrm{CHL}\,a = 95.09 - 0.30\,\mathrm{TP} - 9.96\,\mathrm{TEMP} + 4.49\times10^{-4}\,\mathrm{TP}^{2} + 0.25\,\mathrm{TEMP}^{2} + 0.02\,\mathrm{TP}\cdot\mathrm{TEMP} \]
adj r² = 0.98, Fit SE = 4.09, Fstat = 2478.82

[Figure: modeled chlorophyll a (μg L⁻¹), 0-90, against single predictors, with the fits below]

\[ (\mathrm{CHL}\,a)^{0.5} = 1.98 + 0.03\,\mathrm{TP} \]
adj r² = 0.99, Fit SE = 0.41, Fstat = 29857.36

\[ \ln(\mathrm{CHL}\,a) = 2.23 + 0.002\,\mathrm{TEMP}^{2} \]
adj r² = 0.99, Fit SE = 1.03, Fstat = 6323.88

\[ \mathrm{CHL}\,a = -862.16 + 473.88\,W - 103.65\,W^{2} + 12.14\,W^{3} - 0.82\,W^{4} + 0.03\,W^{5} - 0.001\,W^{6} + 5.80\times10^{-6}\,W^{7}, \qquad W = \mathrm{WndSpdAve\text{-}3} \]
adj r² = 0.99, Fit SE = 0.13, Fstat = 13,127.67

Half-Maximal Abundance Concentrations / Conditions:

TP50 = 51.8 μg L⁻¹
TEMP50 = 22.6 °C
WndSpd-3-50 = 18.0 km hr⁻¹
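The quadratic TP & TEMP response surface quoted above can be evaluated directly, which is one way to reproduce the "TEMP slices" view. The sketch uses the coefficients as printed on the slide (rounded, so values are approximate); the function name and the grid of TP and TEMP values are illustrative choices.

```python
def chl_a_surface(tp, temp):
    """Modeled CHL a (ug/L) from TP (ug/L) and TEMP (deg C), using the
    rounded coefficients of the fitted quadratic surface."""
    return (95.09 - 0.30 * tp - 9.96 * temp
            + 4.49e-4 * tp ** 2 + 0.25 * temp ** 2
            + 0.02 * tp * temp)

# One row per temperature slice, one column per TP level (illustrative grid).
for temp in (16, 20, 24, 28):
    row = [round(chl_a_surface(tp, temp), 1) for tp in (46, 92, 138, 184)]
    print(temp, row)
```

The TP*TEMP cross term is what makes the slices non-parallel: the effect of phosphorus on modeled chlorophyll a depends on temperature.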

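Delineating TP Thresholds for Saginaw Bay CHL a (2008-2010)
(Taking Into Account the Interactions and/or Synergisms of Co-Limiting Nutrients)

CHL a as a function of TP & NH4-N

\[ [\mathrm{CHL}\,a] = 6.08 - 0.77\,[\mathrm{TP}] - \frac{52.22}{[\mathrm{NH_4\text{-}N}]} + 0.04\,[\mathrm{TP}]^{2} + \frac{18.04}{[\mathrm{NH_4\text{-}N}]^{2}} + \frac{7.68\,[\mathrm{TP}]}{[\mathrm{NH_4\text{-}N}]} - 0.001\,[\mathrm{TP}]^{3} + \frac{35.4}{[\mathrm{NH_4\text{-}N}]^{3}} - \frac{8.62\,[\mathrm{TP}]}{[\mathrm{NH_4\text{-}N}]^{2}} - \frac{0.06\,[\mathrm{TP}]^{2}}{[\mathrm{NH_4\text{-}N}]} \]

[Figure: CHL a over TP (μg L⁻¹) and NH4-N (μg L⁻¹), with a threshold marked at 25 μg TP L⁻¹]

CHL a as a function of TP & NO3-N (grouping as printed on the slide)

\[ [\mathrm{CHL}\,a] = 1 / -4.15 + (-0.03\,\ln[\mathrm{TP}]) + (5.14/[\mathrm{TP}]^{1.5}) + (1.94\,[\mathrm{NO_3\text{-}N}]) - (11.90\,\exp[\mathrm{NO_3\text{-}N}]/{-0.8267}) - (1.7\times10^{-5}/[\mathrm{NO_3\text{-}N}]) + (16.16\,\exp[\mathrm{NO_3\text{-}N}]) \]

[Figure: CHL a over TP (μg L⁻¹) and NO3-N (mg L⁻¹), with a threshold marked at 25 μg TP L⁻¹]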

Development of Grey Box Technique

Iterations: ANN Models - multiple ANN models utilizing 2 variables at a time to predict output

Generalized equation for 2-variable interaction with output (CHL a)

Iterations: Additive Models

Finalized Combined Model

Global Sensitivity

• Sensitivity about Means

– Local Sensitivity

– Does not consider variable interactions as states change

• Developed Global Sensitivity

– Looks at how variables interact as their states change!
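The contrast between local and state-based sensitivity can be sketched as follows. This is an interpretation under stated assumptions, not the authors' procedure: records are binned by the z-score ("state") of one input, the other inputs are set to their means within each bin so that correlations between inputs carry along, and a small central difference is taken there. The bin edges, step size and toy model are hypothetical.

```python
import statistics

def state_based_sensitivity(model, data, index,
                            edges=(-1.5, -0.5, 0.5, 1.5), delta_sd=0.25):
    """Slope of `model` with respect to input `index`, per state (z-score bin)."""
    column = [row[index] for row in data]
    mean, sd = statistics.mean(column), statistics.stdev(column)
    bins = {}
    for row in data:
        z = (row[index] - mean) / sd
        state = sum(z > e for e in edges)        # bin number, 0 .. len(edges)
        bins.setdefault(state, []).append(row)
    result = {}
    for state, rows in sorted(bins.items()):
        centre = [statistics.mean(col) for col in zip(*rows)]
        low, high = list(centre), list(centre)
        low[index] -= delta_sd * sd
        high[index] += delta_sd * sd
        result[state] = (model(high) - model(low)) / (2 * delta_sd * sd)
    return result

def toy_model(x):
    return x[0] * x[1]                           # pure interaction term

# Two perfectly correlated inputs (invented data).
data = [[float(v), float(v)] for v in range(1, 10)]
print(state_based_sensitivity(toy_model, data, index=0))
```

Here the slope is about 2, 5 and 8 in the low, middle and high states, whereas a single analysis about the means would report only the middle value.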


Each variable has its own distribution of values (states)

Impact of Correlation on State Behavior

PON state   Secchi    TSS     TP    TDP    SRP    NH4    NO3     CL  Sol_Si    POC    DOC
-1.25 σ       1.57  -0.70  -0.98  -0.48  -0.25   0.02  -0.02  -0.57   -0.16  -1.16  -0.80
-0.75 σ       0.53  -0.67  -0.59  -0.04  -0.14   0.09   0.41  -0.02   -0.40  -0.79   0.04
-0.25 σ      -0.17  -0.08  -0.16  -0.11  -0.09  -0.09  -0.04  -0.14   -0.09  -0.26  -0.14
 0.25 σ      -0.40  -0.02   0.14   0.13   0.04  -0.26  -0.24  -0.16    0.39   0.35  -0.06
 0.75 σ      -0.68   0.50   0.31  -0.37  -0.06  -0.49  -0.35  -0.06    0.20   0.87   0.15
 1.25 σ      -0.75   0.64   1.58   0.97   0.72  -0.08  -0.64   0.89    0.31   1.42   0.42
 1.75 σ      -0.85   1.35   0.95   0.05   0.45  -0.45  -0.55   0.95    0.25   2.00   0.95

Global Sensitivity: Global Variation Across States

Significant difference in Global versus Local Sensitivity

Global (State Based) versus Local (Means) Sensitivity


Lake Erie Microcystis (Continuous MLP); Hydrological & Meteorological

HLs: 32-15-14-10-1, TanH/Mom

[Figure: modeled vs. measured Microcystis (Log10 μm³ L⁻¹), 4-12 on both axes; panels: Training & Cross-Validation, Test Application]

[Figure: State-Based Global Sensitivity; Min-Max (Optimal 12 Predictors); y-axis: modeled Microcystis (Log10 μm³ L⁻¹)]

Ecology & ‘Big’ Data:

Not all ‘Big Data’ is created equally: volume, variety, velocity, volatility, veracity

No longer ‘… your daddy’s database …’

‘Big’ Data = ‘Big’ Information = ‘Big’ Value

Does ‘Big’ Data ensure ‘Big’ Science?

Data Issues

• Big: random reduction

• Little: synthetic (SMOTE)

• Imbalanced data

• Zeros

Imbalanced Datasets

• Definition: the under- or over-representation of a class in a dataset

• Also called ill-balanced, unbalanced or uneven

[Figure: Balanced Dataset vs. Imbalanced Dataset]

[Figure: change in class sizes under Under Sampling and Over Sampling]
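Random under-sampling and over-sampling can be sketched as below. This is a minimal illustration with invented records; it assumes the majority class is at least as large as the minority class.

```python
import random

def random_undersample(majority, minority, rng):
    """Randomly drop majority records until both classes are the same size."""
    return rng.sample(majority, len(minority)), list(minority)

def random_oversample(majority, minority, rng):
    """Randomly duplicate minority records until both classes are the same size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

rng = random.Random(42)
majority = [("absent", i) for i in range(90)]    # invented records
minority = [("present", i) for i in range(10)]
under = random_undersample(majority, minority, rng)
over = random_oversample(majority, minority, rng)
print(len(under[0]), len(under[1]))   # 10 10
print(len(over[0]), len(over[1]))     # 90 90
```

Under-sampling discards information from the majority class, while over-sampling repeats minority records exactly, which can encourage overfitting.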

SMOTE’s Informed Oversampling Procedure

SMOTE: Synthetic Minority Over-sampling Technique
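The core step of SMOTE, interpolating between a minority record and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration, not a full implementation: base records are drawn at random rather than visited systematically, and the data points and `k` are arbitrary.

```python
import math
import random

def smote(minority, n_synthetic, k=3, rng=None):
    """Create synthetic minority records on the line segments joining a
    record to one of its k nearest minority-class neighbours."""
    rng = rng or random.Random()
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()                       # position along the segment
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Invented minority-class records with two features.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=5, k=2, rng=random.Random(1))
print(len(new_points))  # 5
```

Unlike plain duplication, each synthetic record is a new point lying between existing minority records, which is why the procedure is described as informed oversampling.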

Lake Erie (2009-2011) Chlorophyll a & Microcystis Distributions

World Health Organization Guidance Values for Acute Health Effects of Cyanobacteria-Dominated Waters *

* after Chorus & Bartram 1999

Chlorophyll a:
‘Low’ Risk (< 10 μg L⁻¹)
‘Moderate’ Risk (> 10-50 μg L⁻¹)
‘High’ Risk (> 50 μg L⁻¹)

Microcystis:
‘Low’ Risk (< 2 x 10⁷ cells L⁻¹)
‘Moderate’ Risk (> 2 x 10⁷ cells L⁻¹)
‘High’ Risk (> 10⁸ cells L⁻¹)
‘Very High’ Risk (> 10¹⁰ cells L⁻¹)

n = 62

A ‘two-step’ approach (akin to GLiMMs): Presence / Absence, then Continuous
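The guidance values above translate directly into a classification rule, sketched below. The function names are illustrative, and the handling of values that fall exactly on a boundary (assigned to the higher class at the lower bound) is an assumption, since the slide does not specify it.

```python
def chlorophyll_risk(chl_ug_per_l):
    """Risk class from chlorophyll a concentration (ug/L)."""
    if chl_ug_per_l < 10:
        return "low"
    if chl_ug_per_l <= 50:
        return "moderate"
    return "high"

def microcystis_risk(cells_per_l):
    """Risk class from Microcystis cell density (cells/L)."""
    if cells_per_l < 2e7:
        return "low"
    if cells_per_l <= 1e8:
        return "moderate"
    if cells_per_l <= 1e10:
        return "high"
    return "very high"

print(chlorophyll_risk(4.2), chlorophyll_risk(27.0), chlorophyll_risk(80.0))
# low moderate high
print(microcystis_risk(5e6), microcystis_risk(5e8), microcystis_risk(2e10))
# low high very high
```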

Lake Erie Microcystis (Presence-Absence MLP); Hydrological & Meteorological

HLs: 29-15-10-5-1

Training & Cross Validation (class imbalance corrected via SMOTE)

             Absent   Present
Absent          130        10
Present           8       151
Total 299

Accuracy (% correct) - 93.98
% Absent Correct – 94.20
% Present Correct – 93.79

Test Application

             Absent   Present
Absent           10         7
Present           4        65
Total 86

Accuracy (% correct) - 89.0
% Absent Correct – 71.43
% Present Correct – 90.23

[Figure: Microcystis occurrence probability (0-1.0), Presence vs. Absence]

Concentrations / Conditions for Occurrence Likelihood of Microcystis:

TP P-A = 43 μg L⁻¹
TEMP P-A = 21 °C
WndSpd P-A = 19 km hr⁻¹
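The accuracy figures for the training and cross-validation matrix can be recomputed from its counts, as sketched below. The orientation (rows as predicted class, columns as actual class) is an assumption, chosen because it reproduces the reported percentages.

```python
def confusion_summary(matrix):
    """matrix[predicted][actual] holds counts. Returns overall accuracy and
    the percentage of each actual class that was predicted correctly."""
    classes = list(matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[c][c] for c in classes)
    summary = {"accuracy": 100.0 * correct / total}
    for c in classes:
        actual_total = sum(matrix[p][c] for p in classes)
        summary[c] = 100.0 * matrix[c][c] / actual_total
    return summary

# Training & cross-validation counts (rows assumed to be predictions).
training = {
    "Absent": {"Absent": 130, "Present": 10},
    "Present": {"Absent": 8, "Present": 151},
}
summary = confusion_summary(training)
print({k: round(v, 2) for k, v in summary.items()})
# {'accuracy': 93.98, 'Absent': 94.2, 'Present': 93.79}
```

Reporting per-class percentages alongside overall accuracy matters for imbalanced data, where a high overall accuracy can hide poor performance on the rarer class.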

Visualizing Predictive Variances & Uncertainties for Microcystis (Continuous)

[Figure: Microcystis as a function of TP & TEMP; modeled Microcystis (Log10 μm³ L⁻¹), 5.5-11.0, over total phosphorus (0-225 μg L⁻¹) and temperature (10.0-32.0 °C)]

[Figure: Microcystis as a function of TP (by TEMP slices); x-axis: total phosphorus (μg L⁻¹), 0-225; y-axis: modeled Microcystis (Log10 μm³ L⁻¹), 5.5-11.0]

Half-Maximal Abundance Concentrations / Conditions:

TP50 = 50.4 μg L⁻¹
TEMP50 = 26.3 °C
WndSpd(-3D)50 = 23.8 km hr⁻¹

So … Why Care About This Data ‘Stuff’?

‘Paerl-o-gram’ courtesy of Hans Paerl, Univ. North Carolina - Chapel Hill

This is where we are TODAY!

Machine-learning analytics

Machine-learning algorithms capable of autonomously unearthing and reproducing complex patterns within sizeable data quantities afford great potential for fueling ecological hypothesis creation and ‘intelligent’ knowledge derivation (here, ‘Robo-ecology’).

Still more effort is needed to develop and investigate new ideas
