
HydroGIS 96: Application of Geographic Information Systems in Hydrology and Water Resources Management (Proceedings of the Vienna Conference, April 1996). IAHS Publ. no. 235, 1996. 685

A hybrid expert system and neural network approach to environmental modelling: GIS applications in the RAISON system

DAVID LAM National Water Research Institute, Environment Canada, PO Box 5050, Burlington, Ontario L7R 4A6, Canada

DAVID SWAYNE Department of Computing and Information Science, University of Guelph, Guelph, Ontario N1G 2W1, Canada

Abstract Decision support systems for solving environmental problems require more than a GIS. They need other components such as statistical trend analysis, graphic visualization, classification schemes, models and optimization procedures. These components must be linked to the GIS and with each other so that data and information can be easily stored, retrieved, analysed and presented. In this paper, a generic decision support system shell, RAISON, was used to streamline the operation and linkages of these components, to select and run models based on expert rules and to fill data gaps using neural networks. Examples of these applications were presented for environmental problems that require deep knowledge of hydrological sciences, such as acid rain and watershed management.

INTRODUCTION

Environmental sciences are multidisciplinary in nature. Solutions to environmental problems require the integration of data, models and knowledge in hydrology, ecology and other disciplines. GIS is an obvious tool, e.g. to overlay hydrological data with land use planning. However, as decision support tools evolve, more components are required. Data visualization techniques are needed to help determine the geographical significance of data points in statistical relationships. There is also a need for special grouping and classification schemes to help build management strategies, as well as methodologies to deal with spatial uncertainties such as fuzzy boundaries in data and error propagation in models. These requirements call for generic software that can provide the necessary analytical tools not readily available on GIS's.

An example of such generic toolkit software is the RAISON (Regional Analysis by Intelligent Systems ON microcomputers) System developed by Environment Canada over the past 10 years (Lam & Swayne, 1993). It has evolved from practical concerns for an affordable and working system that can help design environmental information systems and decision support systems for both developed and developing countries. Originally it was developed for acid rain impact analysis (Lam et al., 1989) and subsequently applied to many environmental applications such as watershed management, sewage outfall, mine effluent, ecological indicator classification and state of environment reporting.


A teamwork approach that considers not just software development but also user participation has led to a generic operational framework (Fig. 1) for the system. It offers a user-friendly interface to bring in and integrate data, text, maps, satellite images, pictures, video, models and other knowledge input. The system provides the user with a library of software functions and tools, e.g. algorithms, models, optimization procedures, expert systems and neural networks. They can be used to design solution packages that produce customized interfaces and output such as interpretation, advice, scenario tests, strategic analysis and policy recommendations. In this paper, we illustrate this broader view, from both the database and knowledge-base aspects, with examples from environmental applications of the RAISON System for Windows, which has been recently developed (Lam et al., 1995), based on previous versions (Lam & Swayne, 1993) for DOS and UNIX systems.

INPUT: data, text, maps, photos, video, satellite image, models, knowledge

TOOLS: database, rule base, graphics, statistics, expert systems, neural network, uncertainty analysis, optimization

OUTPUT: interpretation, integrated results, presentation, classification, scenarios, cost benefit, risk analysis, recommendations

Fig. 1 Framework of the RAISON decision support system.

DATABASE AND APPLICATIONS

The database component is the core of the system and has to work with other systems such as GIS, statistics, and analysis tools, before it can be connected to the knowledge-base. Often though, this basic core can generate simple but interesting applications by itself.

Database

An advantage of developing a decision support system under the Windows platform is the ease of bringing in data from many existing databases such as ACCESS and EXCEL. Data and maps can be brought in from these databases through the Windows system via local or wide area networks into the RAISON system. RAISON provides a starting procedure that creates the project structure for a given application so that data and maps will be brought in, linked and made available for that project. Figure 2 is an example that shows water quality data for the Atlantic region in Canada selected from an EXCEL file and moved directly or through the Windows Clipboard into RAISON.


Data screening and visualization

Before data can be used by decision support systems, they generally require screening and analysis to determine their quality and the simple relationships among them. For example, for the data imported from EXCEL, we can select several water quality parameters and view them in commonly used statistical plots (Fig. 2) such as quantile-quantile (q-q) plots, percentile box plots and scatter plots. In RAISON, one can associate the data shown in these graphical plots with their locations on GIS maps. For example, a regression plot can be made between the water quality parameters sulphate and pH, bounded by an envelope defining the 95% confidence level. One can compare data sites within this confidence envelope of the regression (Fig. 2, green dots on map) with those outside the envelope (red dots on map) to screen or group data.
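The screening step can be sketched in a few lines of Python. This is only an illustration, not the RAISON implementation: the function names are hypothetical, and the envelope is approximated as a constant-width band of plus or minus z times the residual standard deviation around an ordinary least-squares line, using a normal quantile rather than an exact prediction interval.

```python
from statistics import NormalDist, mean


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def screen_sites(x, y, level=0.95):
    """Label each site 'green' (inside the envelope) or 'red' (outside).

    Assumption: the envelope is a constant-width band around the fitted
    line, which is a simplification of a true confidence envelope.
    """
    slope, intercept = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    s = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s
    return ["green" if abs(r) <= half_width else "red" for r in residuals]
```

Here `x` would hold sulphate concentrations and `y` the pH values at the same sites; the returned labels can then be joined to site coordinates for thematic mapping.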

Environmental statistics

A special set of statistical tools for environmental applications (e.g. normal and log-normal distribution analysis), together with methods such as principal component analysis, cluster analysis and multivariate regression, is also implemented in RAISON for Windows. These special functions can be combined with the data visualization tools to generate, for example, a cloud plot by grouping the principal component vectors and then showing their locations through thematic mapping. This combination of fast database retrieval, environmental statistics and GIS mapping leads to the effective testing of environmental guidelines and ecological indicators for the benthic biological community in the Great Lakes (Lam et al., 1995).

Temporal changes and relational databases

Hydrological studies often require special temporal event and trend analysis tools, particularly for site data with lengthy historical records. As these tools are generally not available in GIS's, we supply them as a special trend analysis component in RAISON. The trend analysis module includes seasonal and de-seasonalized trends, with graphics to show change points and other outputs. These tools can be linked with the other modules in RAISON. For example, after highlighting a segment or branch of a river in the map, one can send a structured query language (SQL) query to ask for the particulate phosphorus concentration data measured in the selected zone for summer time only and with pH > 7, under the condition that algal growth is observed. The data retrieval operation may require a search over time and space, as well as over the parameter domain and keywords. Once the data are retrieved, further analysis can be carried out to group the phosphorus concentration by rainfall events (e.g. at 5 mm precipitation intervals) and show the results as statistical box plots (Lam et al., 1995).
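A query of this kind can be sketched with Python's built-in `sqlite3` module standing in for the RAISON database. The table name, column names, zone labels and sample values below are all hypothetical and serve only to make the example runnable; summer is assumed to mean June to August.

```python
import sqlite3

# Hypothetical schema and made-up records, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE samples (
           zone TEXT, sample_date TEXT, particulate_p REAL,
           ph REAL, algal_growth INTEGER, rainfall_mm REAL)"""
)
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("reach_A", "1994-07-15", 0.031, 7.4, 1, 3.0),
        ("reach_A", "1994-07-29", 0.045, 7.8, 1, 7.5),
        ("reach_A", "1994-08-10", 0.052, 7.6, 1, 8.0),
        ("reach_A", "1994-01-20", 0.020, 7.5, 1, 2.0),  # not summer
        ("reach_A", "1994-07-20", 0.040, 6.8, 1, 4.0),  # pH too low
        ("reach_A", "1994-08-02", 0.038, 7.9, 0, 6.0),  # no algal growth
        ("reach_B", "1994-07-15", 0.060, 7.7, 1, 1.0),  # other zone
    ],
)

QUERY = """
SELECT sample_date, particulate_p, rainfall_mm
FROM samples
WHERE zone = ?
  AND CAST(strftime('%m', sample_date) AS INTEGER) BETWEEN 6 AND 8
  AND ph > 7
  AND algal_growth = 1
ORDER BY sample_date
"""


def summer_phosphorus(connection, zone):
    """Return (date, particulate P, rainfall) rows for the selected zone."""
    return connection.execute(QUERY, (zone,)).fetchall()


def group_by_rainfall(records, width_mm=5.0):
    """Group phosphorus values into rainfall bins keyed by the bin's lower edge."""
    bins = {}
    for _date, phosphorus, rain in records:
        lower_edge = int(rain // width_mm) * width_mm
        bins.setdefault(lower_edge, []).append(phosphorus)
    return bins


records = summer_phosphorus(conn, "reach_A")
bins = group_by_rainfall(records)
```

Each list in `bins` holds the values that would feed one box in a box plot of phosphorus concentration against rainfall interval.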

KNOWLEDGE-BASES AND APPLICATIONS

Many decision support systems require a knowledge-base component to store rules, models and other forms of knowledge. Such a knowledge-base should not increase the burden on the users, but rather should simplify and streamline the domain information so that users can query the database intelligently and better understand the scientific processes. Presenting the knowledge effectively requires, in addition to the working core of database and analysis components described above, the following set of knowledge processing tools.

Modelling and integration

Environmental models, particularly hydrological and ecosystem models, have been used to represent the knowledge of scientific processes. As decision support tools, these models, when calibrated, verified and validated with observed data, can be used to make projections, scenarios and cost-benefit analyses. Models can be incorporated into the RAISON System by: (a) using the executable code as given, through appropriate interfaces for the model input and output, (b) emulating the model by a simplified version such as an input-output model, and (c) rewriting the code in a programming language (e.g. Visual Basic) suitable for running under the Windows environment. For example, as shown in Fig. 3, the agricultural non-point source model, AGNPS, developed by Young et al. (1989) can be incorporated in the RAISON System. With a special grid editor (Fig. 3), a grid cell structure can be placed on a watershed map or soil map and the model parameters can be determined from database information such as topology, land use and soil types. This information can then be edited, if necessary, through a special table under RAISON for Windows, which is then made into an appropriate input DOS file for the AGNPS model. The model is then run under DOS with this input file and the output is transferred back to RAISON for Windows.

Similarly, a trajectory model for tracking long range transport of airborne pollutants (e.g. sulphur dioxide) is incorporated into the RAISON system, first by rebuilding or emulating the model with a simpler input-output source-receptor model (Fig. 3). Then the output (e.g. sulphate deposition) of this air pollutant transport model is used as input to a surface water quality model which, in turn, computes the pH, alkalinity and other parameters. The water quality model has been rewritten in Visual Basic for effective execution under RAISON. The output of the water quality model is further used as input to an ecological impact model (e.g. fish species richness model), which in turn computes the distribution of lakes with different classes of species richness (Fig. 3). Thus, the linkage of different types of models is possible in RAISON, as long as their input and output are compatible. As a decision support tool, the combined model can be used in an optimization procedure to find, e.g. the minimum source emission reduction required to achieve a pre-set target level of ecological damage, subject to a given constraint on industrial and economic production (Lam et al., 1995).
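The chaining of models and the optimization step can be illustrated with a toy Python sketch. Nothing here reproduces the actual RAISON models: the linear source-receptor matrix, the linear pH response, the pH damage threshold and all numbers are assumptions chosen only to show how compatible inputs and outputs let one model feed the next, and how a simple bisection search can find the smallest uniform emission cut that meets a damage target.

```python
def deposition(emissions, transfer):
    """Source-receptor model: deposition at each receptor is a weighted
    sum of the source emissions (transfer[receptor][source])."""
    return [sum(t * e for t, e in zip(row, emissions)) for row in transfer]


def lake_ph(dep, base_ph=6.8, sensitivity=0.04):
    """Toy water quality model: pH falls linearly with deposition (assumed)."""
    return base_ph - sensitivity * dep


def damaged_fraction(emissions, transfer, ph_threshold=6.0):
    """Toy ecological model: a lake counts as damaged below ph_threshold."""
    ph_values = [lake_ph(d) for d in deposition(emissions, transfer)]
    return sum(ph < ph_threshold for ph in ph_values) / len(ph_values)


def min_uniform_reduction(emissions, transfer, target, max_cut=0.9, tol=1e-4):
    """Smallest uniform fractional cut with damaged fraction <= target.

    max_cut plays the role of the production constraint. Bisection is
    valid here because damage does not increase as emissions are cut.
    Returns None if even max_cut cannot reach the target.
    """
    def damage(cut):
        reduced = [e * (1.0 - cut) for e in emissions]
        return damaged_fraction(reduced, transfer)

    if damage(0.0) <= target:
        return 0.0
    if damage(max_cut) > target:
        return None
    low, high = 0.0, max_cut
    while high - low > tol:
        mid = (low + high) / 2.0
        if damage(mid) <= target:
            high = mid
        else:
            low = mid
    return high
```

For example, with two sources and three receptor lakes, `min_uniform_reduction([100.0, 50.0], [[0.2, 0.1], [0.1, 0.3], [0.05, 0.05]], target=0.0)` searches for the cut at which no lake remains below the assumed pH threshold.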

Expert system

Rule-based expert systems are useful tools to diagnose the outcomes for given environmental conditions (e.g. odour and taste of groundwater wells), to advise on scientific processes and knowledge limitation (e.g. the choice of models or model coefficients), and to classify data and information (e.g. guideline violations). In the RAISON System, "if-then" rules can be entered in spreadsheet format. The variables (e.g. taste) used in these rules can be described by attributes (e.g. bitter) that can be qualitative, numerical, logical or a range of values. A simple interface enables the user to enter the variables, their attributes and the rules. A generic inference engine is then used to organize the rule base and to reach appropriate conclusions or recommend a course of action (e.g. search the database for further information, display results on GIS maps, or even call another expert system).
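A rule table of this kind, with one row per rule and one column per variable, can be sketched as follows. The variables, attribute values and rule contents are hypothetical, and the matching loop is a deliberately minimal stand-in for a generic inference engine: it fires every rule whose conditions all hold, where a condition is either an exact attribute value (qualitative or logical) or a numeric range.

```python
# Each rule: (conditions, conclusion). A condition value is either an exact
# attribute (e.g. "bitter", True) or a (low, high) numeric range where None
# means unbounded. All rule contents are hypothetical.
RULES = [
    ({"taste": "bitter", "ph": (None, 6.5)},
     "search database for further information"),
    ({"odour": "strong"},
     "display results on GIS map"),
]


def matches(condition, value):
    """Check one variable's value against a rule condition."""
    if isinstance(condition, tuple):
        low, high = condition
        return (
            value is not None
            and (low is None or value >= low)
            and (high is None or value <= high)
        )
    return value == condition


def infer(facts, rules):
    """Return the conclusions of all rules whose conditions are satisfied."""
    return [
        conclusion
        for conditions, conclusion in rules
        if all(matches(cond, facts.get(var)) for var, cond in conditions.items())
    ]
```

A call such as `infer({"taste": "bitter", "ph": 6.0}, RULES)` returns the actions to take next; a fuller engine would also chain conclusions back into the fact set.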

A simple application of this expert system module is given in Lam et al. (1989), which used expert rules to select and combine model results for acid rain impact analysis. The original rule set in this application was a crisp set, i.e. each rule is based on a decision using precise values of the attributes, without allowing for fuzzy conditions. For example, one of the rules says that if the TD (Trickle Down) Model result is greater than or equal to 0.1 meq l⁻¹ and the CDR (Cation Denudation Rate) Model result is less than 0.1 meq l⁻¹, then the expert system result is the average of the two model results. The crispness of this rule dictates that the attribute value of 0.1 meq l⁻¹ has to be treated precisely, although the experts who advised us may have meant "roughly speaking" or "in the neighbourhood of 0.1 meq l⁻¹". Using RAISON for Windows, we improve this rule set by allowing for fuzziness. For example, the user can design from a graphical interface a smooth continuous membership function to describe the condition before and after the critical value of 0.1 meq l⁻¹ (as compared to a sharp discontinuity in the crisp rule set). The interaction of fuzzy variables and the outcome are then subjected to a choice of available "defuzzification" procedures, e.g. the minimum-maximum rules of implication (Wong et al., 1995). In other words, different membership functions can intersect and combine in many ways, and the defuzzification procedure eventually produces a value that satisfies the rules but allows for a smooth transition of outcome in the neighbourhood of the critical threshold.
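The difference between the crisp and fuzzy treatment of the 0.1 threshold can be made concrete with a small sketch. The membership shape (a linear ramp of assumed half-width 0.02), the use of the minimum for the logical AND, and the weighted blend used as defuzzification are illustrative choices, not the procedure of Wong et al. (1995). The source does not say what happens when the rule does not apply, so the `fallback` value is a hypothetical placeholder supplied by the caller.

```python
def crisp_ge(x, threshold=0.1):
    """Crisp membership for 'x >= threshold': a sharp step."""
    return 1.0 if x >= threshold else 0.0


def ramp_ge(x, threshold=0.1, width=0.02):
    """Fuzzy membership for 'x is roughly >= threshold': a linear ramp
    rising from 0 at threshold - width to 1 at threshold + width."""
    low, high = threshold - width, threshold + width
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def rule_strength(td, cdr, ge=ramp_ge):
    """Degree to which 'TD >= 0.1 AND CDR < 0.1' holds.
    AND is taken as the minimum; 'CDR < 0.1' as the complement of 'CDR >= 0.1'."""
    return min(ge(td), 1.0 - ge(cdr))


def combined_result(td, cdr, fallback, ge=ramp_ge):
    """Blend the rule's conclusion (average of the two model results) with a
    caller-supplied fallback, weighted by the rule strength."""
    weight = rule_strength(td, cdr, ge)
    return weight * (td + cdr) / 2.0 + (1.0 - weight) * fallback
```

With `ge=crisp_ge` the output jumps as the TD result crosses 0.1; with the default ramp it changes gradually across the neighbourhood of the threshold.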

Neural network and data/knowledge gaps

A neural network is a knowledge generation tool in the sense that it uses algorithms modelled on the way human neurons transmit and process information, i.e. it detects patterns and learns by example. It differs from the expert system, which requires the knowledge to be known a priori and provided by an expert. In the neural network approach, data or examples are used to "train" the network so that subsequently, when similar conditions arise, it "recalls" the knowledge and predicts the consequences. The neural network component implemented in RAISON for Windows is based on a variation of the feed-forward, back-propagation algorithm (Wong et al., 1995). This algorithm is selected for its capability in detecting patterns and making predictions, particularly for matching missing data from known cases. It consists of an input data layer which feeds information in a parallel and somewhat redundant mode (as in human neurons) into a so-called hidden layer, with a set of connection weights that are continually adjusted. Similarly, the hidden layer can further feed the weighted messages to other hidden layers, and eventually supply the outcome to the output layer. In the training mode, examples of data with known input and expected output are used. The "feed-forward" output works in conjunction with a "backward" error propagation to adjust the weights repeatedly until a tolerance level is reached. In the predictive mode, new input data are entered and the output is generated by using the weights determined during the training mode.

As an example, Fig. 4 shows the neural network, with four input nodes and one hidden layer with two nodes, that was used to generate missing data of dissolved organic carbon (DOC) based on information from four other water quality parameters (colour, aluminium, iron and ammonium) for different regions or aggregates of watersheds in Eastern Canada. Since DOC is an important variable used in the expert rule-base for acid rain analysis discussed earlier, the use of the neural network results would help sharpen the expert system outcomes. When compared to observed alkalinity data from four aggregates totalling 568 sites in Eastern Canada, the original crisp rule-base prediction produced an aggregate median of relative errors from 15% to 18.7% and regression slopes of 0.8 to 1.16; the fuzzy expert system without the neural network input yielded 14.7% to 18.2% and 0.8 to 1.15; and the expert system with fuzzy logic and neural network input gave 14.3% to 15.2% and 0.83 to 1.15. Thus, the hybrid neural network and expert system approach led to smaller errors and better fit with observed data.
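A network with the same shape as the one described (four inputs, one hidden layer of two nodes, one output) can be sketched in plain Python. This is a generic feed-forward, back-propagation network, not the variant used in RAISON: the sigmoid hidden units, linear output, learning rate, tolerance and weight initialization are all assumptions, and any training data used with it would need to be scaled to comparable ranges.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """4-2-1 feed-forward network: sigmoid hidden layer, linear output."""

    def __init__(self, n_in=4, n_hidden=2, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, rate):
        """One feed-forward pass followed by backward error propagation."""
        hidden, output = self.forward(x)
        error = output - target
        for j, h in enumerate(hidden):
            # Gradient through hidden node j, using the weight before update.
            grad_hidden = error * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= rate * error * h
            self.b1[j] -= rate * grad_hidden
            for i, xi in enumerate(x):
                self.w1[j][i] -= rate * grad_hidden * xi
        self.b2 -= rate * error


def mse(net, data):
    return sum((net.predict(x) - t) ** 2 for x, t in data) / len(data)


def train(net, data, rate=0.1, tol=1e-3, max_epochs=2000):
    """Adjust the weights repeatedly until the error tolerance is reached."""
    for _ in range(max_epochs):
        for x, target in data:
            net.train_step(x, target, rate)
        if mse(net, data) < tol:
            break
    return mse(net, data)
```

In the gap-filling setting, each training pair would hold the four scaled predictor values for a site with a known DOC measurement, and `predict` would then be applied to sites where DOC is missing.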

Fig. 4 Neural network configuration used for filling data gaps.

FUTURE WORK

Currently we are working on further development of RAISON for Windows by using other information technologies such as genetic algorithms, causal networks and uncertainty propagation. Presentation tools including multimedia and 3-dimensional graphics (Fig. 3) are also being considered. These techniques, which can readily be incorporated into the system and connected to GIS and other components, are required by many applications.

Acknowledgements The authors are indebted to the contributions and efforts of the RAISON development team, especially I. Wong, D. Kay, K. Brown, P. Fong, A. Storey, J. Storey and J. McNeil. They are also thankful for input from research partners involved in the Duffin's Creek Study and the Integrated Assessment Modelling project. Windows, EXCEL, ACCESS and Visual Basic are products of Microsoft Corp., Redmond, Washington, USA. DOS and UNIX are operating systems developed by several vendors, e.g. MS-DOS by Microsoft Corp., Redmond, Washington, USA and UNIX (AT&T) by AT&T Corp., New York, USA.

REFERENCES

Lam, D. C. L. & Swayne, D. A. (1993) An expert system approach of integrating hydrological databases, models and GIS: application of the RAISON system. In: Application of Geographic Information Systems in Hydrology and Water Resources Management (ed. by K. Kovar & H. P. Nachtnebel) (Proc. Int. Conf. HydroGIS'93, Vienna, April 1993), 23-34. IAHS Publ. no. 211.

Lam, D. C. L., Booty, W. G., Wong, I., Kay, D. & Kerby, J. (1995) RAISON decision support system: examples of applications. NWRI Contribution Report no. 95-151, Burlington, Ontario, Canada.

Lam, D. C. L., Swayne, D. A., Storey, J. & Fraser, A. S. (1989) Watershed acidification using the knowledge-based approach. Ecol. Modelling 47, 131-152.

Wong, I., Lam, D. C. L., Fong, P., Storey, A. & Swayne, D. A. (1995) A neuro-fuzzy system for acidic deposition modelling. In: Proc. Int. ICSC Symp. Fuzzy Logic (ed. by N. C. Steele), C, 136-141. ICSC Academic Press, Zurich, Switzerland.

Young, R. A., Onstad, C. A., Bosch, D. D. & Anderson, W. P. (1989) AGNPS: a non-point source pollution model for evaluating agricultural watersheds. J. Soil and Wat. Conserv. 44, 168-173.