Transactions In Gis

  • 8/13/2019 Transactions In Gis

    1/20

    Transactions in GIS, 2003, 7(4): 447–466

    2003 Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden MA 02148, USA.

    Research Article

    Building Web-Based Spatial Information Solutions around Open Specifications and Open Source Software

    Geoffrey Anderson

    Cloudshadow Consulting, Inc., Boulder, Colorado

    Rafael Moreno-Sanchez

    Department of Geography

    University of Colorado at Denver

    Abstract

    Geographic Information Systems (GIS) are moving from isolated, standalone,

    monolithic, proprietary systems working in a client-server architecture to smaller

    web-based applications and components offering specific geo-processing functionality

    and transparently exchanging data among them. Interoperability is at the core of

    this new web services model. Compliance with Open Specifications (OS) enables interoperability. Web-GIS software's high costs, complexity and special requirements

    have prevented many organizations from deploying their data and geo-processing

    capabilities over the World Wide Web. There are no-cost Open Source Software (OSS)

    alternatives to proprietary software for operating systems, web servers, and Relational

    Database Management Systems. We tested the potential of the combined use of OS

    and OSS to create web-based spatial information solutions. We present in detail the

    steps taken in creating a prototype system to support land use planning in Mexico with

    web-based geo-processing capabilities currently not present in commercial web-GIS

    products. We show that the process is straightforward and accessible to a broad audience

    of geographic information scientists and developers. We conclude that OS and OSS allow the development of web-based spatial information solutions that are low-cost,

    simple to implement, compatible with existing information technology infrastructure,

    and have the potential of interoperating with other systems and applications in the future.

    Address for correspondence: Rafael Moreno-Sanchez, Department of Geography, University of Colorado at Denver, Campus Box 172, P.O. Box 173364, Denver, CO 80217-3364, USA. E-mail: [email protected]

    1 Introduction

    With a few exceptions (e.g. the Geographic Resources Analysis Support System, GRASS;

    http://www.cecer.army.mil/grass/GRASS.main.html), geographic information technology

    has developed as isolated, standalone, monolithic, proprietary systems. This is rapidly

    changing as geo-processing principles and functionality are moving out of a tightly

    defined niche into the information technology (IT) mainstream. Isolated, standalone

    systems are being replaced by integrated components, and large applications are being

    replaced by smaller, more versatile applications that work together transparently across networks. Of these, the World Wide Web (WWW or the web) is becoming the core

    medium for distributed computing in IT generally and in the geo-processing domain

    specifically (Hecht 2002b). In other words, Geographic Information Systems (GIS), once

    focused on data and tools implemented with client-server architecture, now are evolving

    to a web services model (Dangermond 2002). In this new architecture the web is used for

    delivering not just data, but geo-processing functionality that can be wrapped in inter-

    operable software components called web services. These components can be plugged

    together to build larger, more comprehensive services and/or applications (Hecht 2002c).

    Interoperability between heterogeneous environments, systems and data is fundamental

    for the implementation of this web services model.

    With respect to their IT infrastructure, organizations aim to: (1) maximize productivity

    and efficiency; (2) protect critical information infrastructure; and (3) overcome prob-

    lems related to data sharing, security and data maintenance, as well as software special

    requirements and steep learning curves. The WWW offers the potential benefits of

    flexibility, ubiquity, and reduced costs and risks of obsolescence and isolation. However,

    when organizations try to use the web as a platform to deliver geographic data and provide

    geo-processing functionality to their end users, they commonly find that commercial

    web-GIS software raises the following issues: (1) it does not currently offer out of the

    box geo-processing functionality to perform many of the analyses demanded by their

    end users; (2) it is expensive; (3) it has a steep learning curve; (4) it requires that some

    of their IT personnel become specialists in the software operation and maintenance; and

    (5) it is difficult to integrate with existing IT infrastructure (personnel skills, software

    and applications).

    The use of OS and Open Source Software (OSS) offers the potential to overcome the

    above-mentioned issues and facilitate the deployment of geographic data and geo-processing

    functionality on the WWW. There is a growing interest in the use of OS. For example,

    the British Ordnance Survey is using the Geography Markup Language (GML) OS to

    deliver the Digital National Framework on the web and to mobile devices (Holland 2001;

    http://www.ordinancesurvey.co.uk/dnf/home.htm). According to Lowe (2002), there is

    a growing market for OSS products fed by small organizations and regional government

    agencies that cannot afford proprietary software's (web-GIS, DBMS and web servers)

    costs, complexity, steep learning curves, training costs, and special requirements. There

    are already several successful examples of the use of OSS to create basic web-mapping

    functionality (see cases described by Lowe 2002 and Ramsey 2002).

    In spite of this growing interest, little has been published about the combined use

    of OS and OSS for the creation of web-based geo-processing applications. Even less is

    found in the form of detailed explanations of how they can interact and complement

    each other to create these applications. This article aims to contribute to the knowledge

    base about OS and OSS, and how they can be used to create web-based spatial informa-

    tion solutions. We demonstrate the process through a case study in which we created

    a prototype system to support land use planning in central Mexico. The system implements geo-processing functionality currently not available out of the box in commercial

    web-GIS software. The article is organized as follows: section 2 defines OS, OSS and

    interoperability, and provides the necessary background regarding the organizations and

    efforts to create OS; section 3 defines and provides a brief background of the specific OS

    and OSS technologies used to create the web-based spatial information system described

    in this paper; section 4 presents a brief background about the need for the system and

    a detailed explanation of the process followed to create a prototype web-based spatial information system with querying and Boolean-intersect overlay geo-processing capabilities

    to support land use planning in central Mexico; section 5 presents a discussion

    of the implications and difficulties in applying these technologies; and finally, section 6

    presents conclusions and suggestions for future research and implementations.

    2 Defining Open Specifications (OS), Interoperability, and Open Source Software (OSS)

    Open Specifications provide software engineers and developers information about a given

    specification as well as specific programming rules and advice for implementing the

    interfaces and/or protocols that enable interoperability between systems. The OpenGIS

    Consortium Inc. (OGC) (http://www.opengis.org) defines interoperability as the ability

    for a system or components of a system to provide information portability and inter-

    application cooperative process control. In the context of the OGC specifications this

    means software components operating reciprocally (working with each other) to overcome

    tedious batch conversion tasks, import/export obstacles, and distributed resource access

    barriers imposed by heterogeneous processing environments and heterogeneous data.

    Herring (1999) and Kottman (1999) present an in-depth discussion of the OpenGIS

    Data Model and the OGC process for the creation of OS respectively. Software products

    can be submitted for testing their interfaces for compliance with OGC OpenGIS Imple-

    mentation Specifications (see http://www.opengis.org/techno/implementation.htm for

    the most recent approved and in process specifications). Initially, the only OpenGIS

    Specifications that products could conform to were the OpenGIS Simple Features

    Specifications for CORBA, OLE/COM and SQL (McKee 1998), but there are now 11

    different specifications. Within computer environments there are many different aspects

    of interoperability (Vckovski 1998): (1) independent applications running on the same

    machine and operating system, i.e. interoperability through a common hardware

    interface; (2) application A reading data written by another application B, i.e. inter-

    operability through a common data format; and (3) application A communicating with

    application B by means of interprocess communication or network infrastructure,

    i.e. interoperability through a common communication protocol. Besides technical issues,

    there are also interoperability topics at higher levels of abstraction such as semantic

    barriers (Harvey 1999, Seth 1999). A system based on the OS described in a later section

    of this article would be able to achieve a level of interoperability of the second above-

    mentioned type.

    According to Hecht (2002b) interoperability is desirable for the following reasons:

    (1) it allows for communication between information providers and end users without

    requiring that both have the same geo-processing or viewer software; (2) no single

    Geographic Information System (GIS), mapping tool, imaging solution or database answers

    every need; (3) there are large numbers of database records with a description of location that have the potential to become spatial data, and also, advances in several

    technologies (e.g. GPS integrated into mobile devices) are increasing the number of database

    records with location information; (4) the number of software companies offering

    components to deal with geographic information is growing; (5) it is more efficient to

    collect data once and maintain them in one place (this is particularly cost effective if

    communities of users can find, access and use the information online, so they do not

    need to access, retrieve and maintain whole files and databases of information for which others are responsible); (6) the ability to seamlessly combine accurate, up-to-date data

    from multiple sources opens new possibilities for improved decision making and makes

    data more valuable; and (7) the ability for multiple users, including non-GIS experts, to

    use a particular set of data (perhaps at different levels with different permissions) also

    makes the data more valuable. Gardels (1997) discusses how compliance with OGC's

    OpenGIS specifications and the resulting interoperability can contribute to integrating

    distributed heterogeneous environments into on-line environmental information systems

    (EIS). He points to three technical strategies (federation, catalogs and data mining) for

    the integration of these systems, and how they are heavily dependent on interoperability

    among diverse data sources, formats and models. He concludes that properly designed

    geodata access and analysis tools, combined with open environmental information systems,

    can provide sophisticated decision support to the users of geographic information.

    Two organizations have been coordinating the development of the open specificat-

    ions used in this paper: the OpenGIS Consortium Inc. (OGC) (http://www.opengis.org)

    and the World Wide Web Consortium (W3C) (http://www.w3.org). The W3C has created

    more than forty technical specifications (http://www.w3.org/TR/) and as of January

    2002, the OGC has adopted nine OpenGIS Implementation Specifications and 11 candidate

    specifications are in the works (Hecht 2002a; a roadmap to the specifications work is

    presented at http://www.opengis.org/roadmap/index.htm).

    Briefly, Open Source Software (OSS) are programs whose licenses give users the free-

    dom to run the program for any purpose, to modify the program, and to freely redistribute

    either the original or modified program without further limitations or royalty payments

    (http://www.opensource.org/docs/definition.php). Among the most well known OSS

    projects are the Linux operating system and Apache web server. Sometimes the term

    Open Technologies is used to refer to these projects and others such as XML, HTML,

    TCP/IP, and Java technology. A comprehensive list of GIS-related OSS can be found at

    http://opensourcegis.org/. According to Wheeler (2002), the reliability, performance,

    scalability, security and total cost of ownership of OSS are at least as good as, or better than, those of its

    proprietary competition, and under certain circumstances OSS is a superior alternative to

    its proprietary counterparts.

    3 Background on the Specific OS and OSS Used to Create a Web-based Spatial Information System

    This section provides background information about the origin and relationships among

    the OS and OSS we used. We also point to their relevance for the creation of web-based

    geo-processing functionality.

    The Extensible Markup Language (XML) is a subset of the Standard Generalized

    Markup Language (SGML) [ISO 8879] (http://www.w3.org/TR/1998/). XML uses

    pairs of text-based tags, enclosed in angle brackets, to describe the data. These tags make the information passed across the Internet self-describing (Waters 1999). Part of its

    success comes from: (1) the fact that it can be read and written by humans (in contrast

    with binary formats) and thus provides a single way of representing structure regardless

    of whether the information is intended for human or machine consumption; and (2) its

    similarity to the widely used Hyper Text Markup Language (HTML). XML satisfies

    two compelling requirements: first, it separates data from presentation, and second, it

    transmits data between applications. XML is a metalanguage, i.e. a language that describes other languages (Boumphrey et al. 1998; http://www.xml.com). These languages

    are called XML schemas (for a detailed definition of what constitutes a schema, and

    how new schemas can be created see Ducket et al. 2001). There are schemas for over

    40 different areas of expertise (http://www.xml.org/xml/registry.jsp presents a registry of

    XML schemas). In the web-based geo-processing arena, XML is being used to exchange

    metadata and control information between computers, and between them and humans.

    According to Aloisio et al. (1999), XML will play a major role in enabling computers

    to communicate universally with other computers, and to create a new generation of

    web services designed to interact with other services. They also reaffirm that XML is

    simple and powerful and its similarity to HTML ensures universal adoption.
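
    The separation of data from presentation can be illustrated with a minimal sketch. It is written in Python, which is not one of the technologies used in the prototype, and the element names (site, soil, elevation) and their values are hypothetical. The same self-describing record is read once and then rendered in two different ways.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: tag names and values are invented for illustration.
RECORD = """<site id="42">
  <soil>Vertisol</soil>
  <elevation units="m">1790</elevation>
</site>"""


def read_site(xml_text):
    """Return the record as a plain dict; the tags themselves name each value."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "soil": root.findtext("soil"),
        "elevation": float(root.findtext("elevation")),
        "elevation_units": root.find("elevation").get("units"),
    }


def as_text(site):
    """One presentation of the data: a line of plain text."""
    return (f"Site {site['id']}: {site['soil']} at "
            f"{site['elevation']:.0f} {site['elevation_units']}")


def as_html(site):
    """A second presentation of the same data: an HTML table row."""
    return (f"<tr><td>{site['id']}</td><td>{site['soil']}</td>"
            f"<td>{site['elevation']:.0f}</td></tr>")


site = read_site(RECORD)
print(as_text(site))
print(as_html(site))
```

    Neither rendering function needs to know how the record was stored, which is the property that makes XML attractive for exchanging data between applications.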

    Scalable Vector Graphics (SVG) and the Geography Markup Language (GML) are

    XML schemas. The first is a vector graphics language written in XML to describe

    two-dimensional graphics. The second is an XML encoding for the transport and stor-

    age of geographic information, including both the spatial and non-spatial properties of

    geographic features. SVG is a W3C open specification (http://www.w3.org/TR/SVG/).

    GML is an OGC open specification (http://www.opengis.net/gml/02-069/GML2-12.html).

    In SVG the graphical elements are represented within XML tags, hence SVG offers

    all the advantages of XML's openness, transportability, and interoperability (Eisenberg

    2002). SVG drawings can be dynamic and support embedded interactivity, animation,

    embedded fonts, XML code, Cascading Style Sheets and scripting languages. A rich set

    of event handlers such as onmouseover and onclick can be assigned to any SVG graph-

    ical object. For example, we used the onmouseover event to show real-world coordinates

    as the user moves the mouse over the SVG map. SVG is capable of using real world

    coordinate systems in contrast to other popular vector graphics formats such as Macro-

    media Flash (Neumann 2002 compares the capabilities of SVG and Flash to handle

    vector graphics in web applications). All these features make SVG appealing for the

    graphical representation of geographic data on the web (Gould and Ribalaygua 1999).

    Puhretmair and Woss (2001) used dynamically generated SVG maps as an intuitive

    interface to present tourist information contained in distributed data sources. The informa-

    tion is distributed among several servers and websites and is structured in different ways.

    XML is used to create query tools and integrate the data by communicating with the

    different services. Then the SVG capabilities to support embedded interactivity, anima-

    tion, embedded fonts, XML, Cascading Style Sheets and scripting languages are used to

    create on the fly maps as response to queries. The SVGOpen Conference is an excellent

    source for the growing field of SVG applications (http://www.svgopen.org).
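
    As an illustration of how a map feature might be written as SVG in real-world coordinates with an event handler attached, consider the following minimal Python sketch. It is not the code of the prototype. The handler name showCoords, the sample coordinates, and the choice to flip the y axis by negating northings are all assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def polygon_to_svg(ring, handler="showCoords(evt)"):
    """Build an SVG document holding one polygon given in map coordinates.

    ring    -- list of (x, y) pairs in real-world units (e.g. metres)
    handler -- call to a hypothetical script function run on mouse-over
    """
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)

    ET.register_namespace("", SVG_NS)
    svg = ET.Element(f"{{{SVG_NS}}}svg")
    # SVG's y axis points down, so y is negated and the viewBox starts at -max_y.
    svg.set("viewBox", f"{min_x} {-max_y} {max_x - min_x} {max_y - min_y}")

    poly = ET.SubElement(svg, f"{{{SVG_NS}}}polygon")
    poly.set("points", " ".join(f"{x},{-y}" for x, y in ring))
    poly.set("onmouseover", handler)
    return ET.tostring(svg, encoding="unicode")


# Invented square, 1 km on a side, in projected metre coordinates.
ring = [(250000, 2300000), (251000, 2300000),
        (251000, 2301000), (250000, 2301000)]
svg_text = polygon_to_svg(ring)
print(svg_text)
```

    Because the viewBox is expressed in map units, a script attached to the handler could report real-world positions after undoing the sign flip on y.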

    Lake (2001a) briefly presents the organization of the GML specification. In GML

    the geometries and attributes of geographic layers are represented within XML tags;

    again, this brings forth all the advantages of XML's openness, transportability and

    interoperability. GML is designed to support interoperability and does so through the

    provision of basic geometry tags (all systems that support GML use the same geometry tags),

    a common data model (features/properties), and a mechanism for creating and sharing application schemas (see the GML 2.1.2 specification at

    http://www.opengis.net/gml/02-069/GML2-12.html). GML conforms to the OGC's Simple Features specifications

    and it is not concerned with the visualization of geographic features such as the drawing

    of maps. Hence, we used SVG for the graphical representation of the data in GML

    format. GML is as critical to the evolution of the geospatial infrastructure on the web

    as HTML was to the development of the conventional Internet (Lake 2001b). GML

    supports geospatial interoperability in various ways (Lake 2001a): first, it provides a common schema framework for the expression of geospatial features; second, it provides

    a common set of GML geometry types, which allows authors of different schemas

    to share the same mechanisms for geometry description and hence be able to interpret

    the correspondence between the schemas when they are referring to the same feature in

    the real world; and third, the definition and publication of GML schemas that can be

    shared across communities of interest such as transportation, environmental issues,

    petroleum exploration, etc. facilitates interoperability on the semantic level.
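
    The feature/property model and the shared geometry tags can be sketched as follows. The geometry elements (polygonProperty, Polygon, outerBoundaryIs, LinearRing, coordinates) are those of GML 2. The application namespace, the SoilUnit feature type and the soilType property are hypothetical stand-ins for an application schema, and Python is used only for illustration.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
APP_NS = "http://example.org/landuse"   # hypothetical application namespace


def soil_feature(soil_type, ring):
    """Return a feature element with one attribute and one polygon geometry."""
    ET.register_namespace("gml", GML_NS)
    ET.register_namespace("app", APP_NS)

    feature = ET.Element(f"{{{APP_NS}}}SoilUnit")
    ET.SubElement(feature, f"{{{APP_NS}}}soilType").text = soil_type

    # Nest the shared GML 2 geometry tags: property > Polygon > ring > coordinates.
    node = ET.SubElement(feature, f"{{{GML_NS}}}polygonProperty")
    for tag in ("Polygon", "outerBoundaryIs", "LinearRing"):
        node = ET.SubElement(node, f"{{{GML_NS}}}{tag}")
    coords = ET.SubElement(node, f"{{{GML_NS}}}coordinates")
    coords.text = " ".join(f"{x},{y}" for x, y in ring)
    return feature


ring = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]   # closed ring, invented values
feature = soil_feature("Vertisol", ring)
print(ET.tostring(feature, encoding="unicode"))
```

    Only the app-prefixed names would differ between communities; the gml-prefixed geometry is what any GML-aware system could read.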

    XSLT (Extensible Stylesheet Language Transformations) is one of three parts that

    compose a bigger language called XSL (Extensible Stylesheet Language). XSLT is a W3C

    open specification (http://www.w3.org/Style/XSL/). Essentially it is an XML-based

    language for transforming the structure of XML documents for display on screen, on

    paper, or as spoken word (Kay 2001). In addition, XSLT is commonly used to transform

    data from one data model (e.g. text) in one application to the data model used in

    another (e.g. SQL statements to create a table in a Relational Database Management

    System). The XSLT formatting code contained in a text file is known as a Style Sheet.

    XML documents are commonly processed through parsing. Geographic data in

    GML format tend to be huge text files (Sahay 2002), therefore it is critical to use the

    most efficient parsing method to process them. The SAX (Simple API for XML) parsing

    method has been proven to be more efficient than its alternative DOM (Document Object

    Model) method for processing GML documents (Sahay 2002). The use of SAX results

    in reduced memory overhead compared to DOM, which requires the retention of the

    complete document as a tree in memory. In our application, we used the SAX method

    to extract the geometries from the geographic layers in GML format and convert them

    to a format more amenable to spatial analysis such as Java2D objects.
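
    The streaming idea behind SAX can be sketched briefly. The application described here used a Java SAX parser; the sketch below uses Python's standard xml.sax module purely for illustration. The two-polygon sample layer is invented, and matching elements by the literal gml: prefix is a simplifying assumption (a more robust handler would be namespace-aware). The handler keeps only the coordinate text it is currently reading, so the whole document is never held in memory as a tree.

```python
import xml.sax

# Invented two-polygon layer; only the gml:* geometry tags follow GML 2.
GML = """<layer xmlns:gml="http://www.opengis.net/gml">
  <gml:Polygon><gml:outerBoundaryIs><gml:LinearRing>
    <gml:coordinates>0,0 4,0 4,3 0,3 0,0</gml:coordinates>
  </gml:LinearRing></gml:outerBoundaryIs></gml:Polygon>
  <gml:Polygon><gml:outerBoundaryIs><gml:LinearRing>
    <gml:coordinates>5,5 6,5 6,6 5,5</gml:coordinates>
  </gml:LinearRing></gml:outerBoundaryIs></gml:Polygon>
</layer>"""


class RingHandler(xml.sax.ContentHandler):
    """Collect every gml:coordinates string as a list of (x, y) tuples."""

    def __init__(self):
        super().__init__()
        self.rings = []
        self._buffer = None  # None means we are outside gml:coordinates

    def startElement(self, name, attrs):
        if name == "gml:coordinates":
            self._buffer = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "gml:coordinates":
            text = "".join(self._buffer)
            self.rings.append([tuple(float(v) for v in pair.split(","))
                               for pair in text.split()])
            self._buffer = None


handler = RingHandler()
xml.sax.parseString(GML.encode("utf-8"), handler)
print(len(handler.rings), "rings; first ring:", handler.rings[0])
```

    Each ring could be handed to a geometry library as soon as its closing tag is seen, which is what keeps the memory footprint small for large layers.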

    The Java2D Application Programming Interface (API) is part of the Java Develop-

    ment Kit (JDK). It is used for manipulation of two-dimensional objects. The Java2D

    API includes the Constructive Area Geometry Methods for the Boolean overlay operations

    intersection, union, subtraction and exclusive-OR (http://java.sun.com/products/java-

    media/2D/). The JDK is free and includes a Java2D demonstration.
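
    The four Boolean overlay operations can be illustrated independently of Java2D. Java2D applies them to vector shapes; the sketch below instead represents each area as a set of unit grid cells in Python. This raster stand-in is an assumption made only to keep the example short, and it is not how Java2D works internally.

```python
def rectangle(x0, y0, x1, y1):
    """Return the set of unit cells (col, row) covered by a rectangle."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


a = rectangle(0, 0, 4, 4)   # 16 cells
b = rectangle(2, 2, 6, 6)   # 16 cells, overlapping a in a 2 x 2 block

intersection = a & b        # cells in both areas
union = a | b               # cells in either area
subtraction = a - b         # cells in a but not in b
exclusive_or = a ^ b        # cells in exactly one of the two areas

for name, cells in [("intersection", intersection), ("union", union),
                    ("subtraction", subtraction), ("exclusive-OR", exclusive_or)]:
    print(name, len(cells))
```

    The cell counts satisfy the identity union = intersection + exclusive-OR, which is a quick sanity check for any implementation of these operations.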

    PHP (acronym derived from its origin as Personal Home Page Tools) is a server-side,

    HTML-embedded, cross-platform scripting language (Rasmus 2000; http://www.php.net/).

    It borrows concepts from other common languages such as C and Perl. PHP provides a

    way to put instructions into HTML files to create dynamic content. The developer can

    embed PHP structured code (e.g. loops, conditionals, rich data structures) inside HTML

    tags. PHP is an OSS. We used it on the server side for process control, processing of the

    user's input, and to invoke and pass parameters to applications.

    PostgreSQL is a sophisticated Object-Relational Database Management System

    (ORDBMS), supporting almost all SQL constructs, including subselects, transactions,

    and user-defined types and functions. It is the most advanced OSS database available

    today (Stinson 2001; Stones and Matthew 2001; http://www.postgreSQL.org/).

    PostGIS, which is also an OSS, is an extension of the PostgreSQL ORDBMS that adds support for geographic objects (http://postgis.refractions.net/). In effect, PostGIS spatially enables

    the PostgreSQL server, allowing it to be used as a backend spatial database for Geographic

    Information Systems (GIS), much like ESRI's Spatial Database Engine (SDE) or Oracle's

    Spatial extension. PostGIS follows the OGC Simple Features Specification for SQL

    (http://www.opengis.org/techno/implementation.htm).

    MapServer is an OSS development environment for building basic web-mapping

    applications (http://mapserver.gis.umn.edu/). It facilitates the display and browsing of geographic data in commonly used vector and raster formats. It is not designed to be a

    full-featured GIS system and hence it does not offer geo-processing functions.

    Linux is a free Unix-type operating system. Specifically, we used the Red Hat

    Linux distribution (http://www.redhat.com). The Apache web server is a Hyper Text

    Transfer Protocol (HTTP) compliant web server. It is an OSS maintained by the Apache

    HTTP Server Project (http://www.apache.org). As of August 2002, 63% of the web sites

    on the Internet are run on the Apache web server (Netcraft Web Server Survey;

    http://www.netcraft.com/survey/).

    4 The Case Study: Creating a Prototype Web-based Spatial Information System to Support Land Use Planning in Central Mexico

    4.1 The need for the web-based system

    During the 1990s, the National Institute for Forest, Agriculture and Livestock

    Research (INIFAP) in Mexico started to use GIS as part of its land suitability studies.

    The largest of these studies was a strategic level national land use planning project to

    identify the areas with potential to grow specific crops, forage and forestry species

    considered of economic relevance for the country. A spatial database of national cover-

    age was created to support this study. The database is organized in the following layers:

    soils (digitized from 1:50,000 scale maps with information about primary and second-

    ary soil type, and presence of chemical or physical phases), a digital elevation model at

    30 meters resolution (elevation and slope are derived from it) and several climate layers

    (30-year monthly and annual averages for minimum and maximum temperatures, pre-

    cipitation and evaporation). These layers of information can be combined to derive

    other parameters that help to estimate the suitability of an area for a specific crop, such

    as the evaporation/precipitation coefficient. For each agricultural, forage and forestry

    species considered of strategic importance for the country, INIFAP's researchers compiled

    a list of the values of the environmental factors (soil type; slope; precipitation;

    maximum, minimum temperatures; etc.) that are considered ideal for the growth of the

    particular species. Together with this information they compiled what is called a

    "technological package", which is a handbook of best practices for the production of the species

    in question. State and federal government agencies, researchers, land owners, seed com-

    panies, entrepreneurs, and agricultural insurance companies, among others, can request

    the identification of the areas that fulfill the environmental factors for the production of

    the species of their interest. These areas are identified by querying the spatial database

    layers for the specific range of values considered ideal for the species in question. These

    query results are overlaid using a Boolean intersect operation to find the areas where all

    the desired environmental parameters are present.
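
    The query-then-intersect procedure described above can be sketched in a few lines. This is an illustration only, written in Python rather than the technologies of the prototype. The layer names, the grid cell values and the ranges are invented, and each layer is simplified to a mapping from grid cell to attribute value.

```python
from functools import reduce

# Hypothetical layers on a tiny 3 x 2 grid: cell (col, row) -> attribute value.
LAYERS = {
    "elevation_m": {(0, 0): 1700, (1, 0): 1850, (2, 0): 2100,
                    (0, 1): 1750, (1, 1): 1900, (2, 1): 2300},
    "annual_precip_mm": {(0, 0): 550, (1, 0): 640, (2, 0): 700,
                         (0, 1): 480, (1, 1): 610, (2, 1): 820},
}


def query(layer, low, high):
    """Select the cells of one layer whose value lies in [low, high]."""
    return {cell for cell, value in layer.items() if low <= value <= high}


def suitable_cells(layers, requirements):
    """Boolean intersect of the per-layer selections.

    requirements maps a layer name to its (low, high) ideal range and
    must name at least one layer.
    """
    selections = [query(layers[name], low, high)
                  for name, (low, high) in requirements.items()]
    return reduce(set.intersection, selections)


# Invented ideal ranges for an unnamed species.
REQUIREMENTS = {"elevation_m": (1800, 2200), "annual_precip_mm": (600, 650)}
result = suitable_cells(LAYERS, REQUIREMENTS)
print(sorted(result))
```

    Widening or narrowing a range in REQUIREMENTS and re-running the function shows directly how the intersected area grows or shrinks.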

    After the completion of the first studies, INIFAP created specialized units to provide

    service to users of this information. This service is peripheral to INIFAP's research responsibilities. However, soon after starting this service the units were overwhelmed

    by requests to identify areas that meet specific environmental requirements. INIFAP is

    in need of an alternative way to fulfill these demands in a more timely, efficient and

    economical way.

    In the past INIFAP had considered using the web as a platform to serve their geographic

    data or selected pieces of it, and to perform the previously described geo-processing

    analyses. However, several factors had deterred them from further pursuing this option: (1) the required Boolean overlay geo-processing capabilities currently do not exist out

    of the box in commercial web-GIS systems; (2) the high costs of web-GIS software; (3) the

    special requirements of this software in terms of dedicated personnel and lengthy train-

    ing; and (4) concerns about the compatibility of the web-GIS software with existing

    IT infrastructure (personnel skills, software and applications). We decided to test the

    viability of using OS and OSS technologies to create a web-based spatial information

    system for non-expert users that will overcome these issues. The aim of the system is to

    allow end users to perform queries for desired values of environmental factors and do

    Boolean overlays of the results of these queries to identify the areas where all the desired

    environmental factors are present. By changing the values in their queries end users can

    have a quick idea of the effects of these changes on the areas selected and their intersection.

    4.2 Creating the prototype system

    The state of Guanajuato in central Mexico was selected as a pilot area to create a proto-

    type system. INIFAP stated the following preferences regarding the design and development

    strategy for the system. It would: (1) easily integrate with the remainder of its existing

    IT infrastructure (personnel skills, software and applications); (2) be scalable, with initial

    low costs and low total costs of ownership of the system over the long run; (3) minimize

    special requirements; (4) not imply steep learning curves; and (5) eventually improve the

    efficiency in the expansion, maintenance and quality control of the national database by

    centralizing these functions in one place.

    Next we present the step-by-step process to create a prototype web-based spatial

    information system around OS and OSS that is capable of processing attribute queries

    and Boolean intersection overlays. The prototype system is based on a PC with

    average technical specifications and a fast (T1) Internet connection. It runs the Linux

    (specifically Red Hat Linux) operating system and the Apache web server. For illustra-

    tion purposes in this article, we have taken a subset of the data and translated the

    interface prompts. This demo system and all the referenced source code and relevant

    web links can be found at http://206.168.217.254/guanajuato/. The demo on this website

    has detailed instructions on how to perform queries, intersect overlays and display the

    results. The prototype system continues to evolve. Some improvements to the first

    version will be described in the discussion section. The interface of the prototype version

    that is presented in this website still contains interaction steps that eventually will be

    hidden from the end users. However, at this point the exposure of these steps serves to

    closely illustrate each of the processes that occur in the system when processing a query

    and overlay request.

    On the interface, the user is presented with a screen divided into two areas: on the

    left, an area to input queries and present instructions to the user, and on the right, a

    map display area (see Figure 4). To input queries the user uses drop-down boxes to choose

    values (or a range of values) for each of the environmental factors (e.g. soils, elevation,

    temperature) contained in the spatial database. After selecting the desired values for

    each parameter, the user clicks the submit query button. The map display window is built

    using MapServer (http://mapserver.gis.umn.edu/). In it the user selects which layers

    he or she wants to display, including the features in each layer that were selected in the

    query. The map must be refreshed every time the type or number of layers displayed is

    changed. Visually it can be determined if there is an area where all the selected features

    intersect. If this is the case, the next step is to write GML documents that represent the

    selected features for each layer. Finally, the intersection of these selected areas is calculated

    and the resulting area is output as an SVG file and as a GML file. The SVG file can

    be displayed by itself or with other layers using the SVGeoprocessor application. The

    SVGeoprocessor application takes the image generated by MapServer as background

    and then overlays the intersection area in SVG format. The SVG interactivity capabilities

    are used to respond to the user's actions as described in STEP 7. All of the processing

    described below takes place on the server side of the system. The client side is only used to

    present the user with input forms and the graphical output resulting from his or her requests.

    Figure 1 presents a flow diagram of the steps required to respond to a user's request

    for queries on the thematic layers and overlay intersect geo-processing. In the description

    of the steps that follow we will explain this diagram in detail.

    STEP 1:

    In this step (see Figure 1), the layers for the pilot area that were originally

    in ESRI's (Environmental Systems Research Institute, Redlands, California) shapefile format

    are converted to tables in the PostgreSQL RDBMS. This conversion is achieved using the

    shp2pgsql utility included as part of the PostGIS extension. This utility takes a shapefile

    and outputs a series of SQL statements (e.g. CREATE TABLE and INSERT) to create a

    table in the PostgreSQL RDBMS (Figure 2). The resulting table contains all the attributes

    of the shapefile including the coordinates that define each feature. These SQL statements

    are then executed in PostgreSQL to create a table that represents the shapefile (Figure 3).

    The shp2pgsql utility allows for the selection of a projection for the data in the resulting

    table. PostGIS contains a file with close to 1800 projection definitions to choose from.

    This process was repeated for each of the layers provided (soils, elevation, and the

    climate layers).
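To make the idea behind STEP 1 more concrete, the following Java sketch builds the kind of INSERT statement that loads one polygon into a spatial table. It is an illustration only and does not reproduce the output of shp2pgsql (Figure 2 shows that). The table name soils, the columns soil_type and the_geom, the attribute value and the SRID of -1 are all hypothetical, and the geometry is assumed to be passed as well-known text through the PostGIS GeometryFromText function.

```java
import java.util.Locale;

public class ShapeToSqlSketch {

    /** Formats one closed ring of (x, y) pairs as a well-known text POLYGON. */
    static String toWktPolygon(double[][] ring) {
        StringBuilder sb = new StringBuilder("POLYGON((");
        for (int i = 0; i < ring.length; i++) {
            if (i > 0) sb.append(',');
            // Locale.ROOT keeps the decimal separator a dot on every machine.
            sb.append(String.format(Locale.ROOT, "%.1f %.1f", ring[i][0], ring[i][1]));
        }
        return sb.append("))").toString();
    }

    /** Builds an INSERT for a hypothetical table with columns soil_type and the_geom. */
    static String insertStatement(String table, String soilType, String wkt, int srid) {
        String escaped = soilType.replace("'", "''"); // double any single quotes
        return "INSERT INTO " + table + " (soil_type, the_geom) VALUES ('" + escaped
                + "', GeometryFromText('" + wkt + "', " + srid + "));";
    }

    public static void main(String[] args) {
        double[][] ring = {{0, 0}, {10, 0}, {10, 10}, {0, 10}, {0, 0}};
        System.out.println(insertStatement("soils", "vertisol", toWktPolygon(ring), -1));
    }
}
```

In practice the utility generates these statements for every record of the shapefile, so hand-written code of this kind is only needed when the source data are not in shapefile format.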

    STEP 2:

    In the query building interface (an HTML form; see Figure 4) the user queries

    the database for the parameters of interest for each layer. When the system finishes

    processing the intersect query, the area where all the requested environmental parameters

    intersect is displayed on an SVG map. If the requested environmental parameters do not

    intersect, a message is sent to the user. The user also has the option to invoke the overlay

    processor interface (Figure 5). This HTML form informs the user first about the number

    of features in each layer that meet the requested parameters, and second the number of

    selected features whose bounding boxes intersect the bounding boxes of the selected

    features in a second layer. By analyzing these numbers, the user should be able to: (1) see

    how many features satisfy the specified parameters for each layer, (2) identify which is the

    most limiting environmental factor in his or her query based on the number of selected

    features, and (3) identify which layers do not intersect. With this information the user can

    perform sensitivity analyses by changing the requested parameters for one or more layers.
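The second number reported by the overlay processor can be illustrated with a short Java sketch. It counts the selected features of one layer whose bounding boxes intersect the bounding box of at least one selected feature in a second layer. In the prototype this test runs inside PostgreSQL-PostGIS; the Java2D version here is only an assumed stand-in, and the layer names and box coordinates are hypothetical.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;

public class BoundingBoxCountSketch {

    /** Counts the boxes in layerA that intersect at least one box in layerB. */
    static int countIntersecting(List<? extends Rectangle2D> layerA,
                                 List<? extends Rectangle2D> layerB) {
        int count = 0;
        for (Rectangle2D a : layerA) {
            for (Rectangle2D b : layerB) {
                // Java2D tests interiors, so boxes that only share an edge do not count.
                if (a.intersects(b)) {
                    count++;
                    break; // one match is enough for this feature
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Rectangle2D> soils = Arrays.<Rectangle2D>asList(
                new Rectangle2D.Double(0, 0, 10, 10),
                new Rectangle2D.Double(50, 50, 5, 5));
        List<Rectangle2D> elevation = Arrays.<Rectangle2D>asList(
                new Rectangle2D.Double(5, 5, 10, 10));
        System.out.println(countIntersecting(soils, elevation) + " of " + soils.size()
                + " selected soil features may intersect the elevation selection");
    }
}
```

A count of zero for any pair of layers tells the user immediately that the final intersection will be empty, which is what makes these numbers useful for sensitivity analysis.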

    STEPS 3 and 4:

    The parameters entered by the user are sent to the server using

    the HTML form. On the server a PHP script converts the user's input for each layer into a

    SQL statement (e.g. SELECT * FROM elevation WHERE elevation >= 1500 AND elevation <= 3000)

    that is fed to the PostgreSQL-PostGIS RDBMS. PostgreSQL-PostGIS is

    invoked from the PHP script to execute the SQL statement, which returns a string of

    coordinates describing the features selected in the layer (together with any of its attributes

    requested). This string is parsed in another PHP script to create a GML polygon and/or

    multipolygon entity in a GML document that now will represent the selected features and

    attribute (Figure 6).
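The two conversions in STEPS 3 and 4 can be sketched as follows. The prototype performs them in PHP; Java is used here only for consistency with the overlay program. The inclusive range bounds, the "x y,x y,..." layout assumed for the coordinate string returned by the database, and the method names are all assumptions made for illustration. The element nesting follows the GML 2 polygon structure.

```java
public class QueryToGmlSketch {

    /** Builds a range query for one layer; table and column names are hypothetical. */
    static String rangeQuery(String table, String column, int min, int max) {
        // A production system should validate identifiers and bind the values
        // through a prepared statement instead of concatenating strings.
        return "SELECT * FROM " + table + " WHERE " + column + " >= " + min
                + " AND " + column + " <= " + max;
    }

    /** Converts "x y,x y,..." (assumed input layout) into a GML 2 Polygon element. */
    static String toGmlPolygon(String ring) {
        StringBuilder coords = new StringBuilder();
        for (String pair : ring.trim().split("\\s*,\\s*")) {
            String[] xy = pair.trim().split("\\s+");
            if (coords.length() > 0) coords.append(' ');
            // GML 2 separates x and y with a comma and successive points with a space.
            coords.append(xy[0]).append(',').append(xy[1]);
        }
        return "<gml:Polygon><gml:outerBoundaryIs><gml:LinearRing><gml:coordinates>"
                + coords
                + "</gml:coordinates></gml:LinearRing></gml:outerBoundaryIs></gml:Polygon>";
    }

    public static void main(String[] args) {
        System.out.println(rangeQuery("elevation", "elevation", 1500, 3000));
        System.out.println(toGmlPolygon("0 0,10 0,10 10,0 10,0 0"));
    }
}
```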

    Figure 1 Flow diagram of the processes taking place in the prototype web-based spatial

    information system

    STEP 5:

    The resulting GML documents corresponding to the features selected in each

    layer (e.g. SoilsSelected.xml, ElevationSelected.xml, etc.) are input into a Java program

    that computes the intersection of the features (one pair at a time). This program defines

    a Java class (GMLoverlay.class) that uses the SAX parsing method to search each GML

    document for the geometries and writes each polygon to an array of Java2D area objects.

    These objects are then intersected using the area intersect function contained in the

    Java2D API (Figure 7). The intersection result is then output as a GML document

    (intersection.xml).
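A minimal sketch of the approach described in STEP 5 is given here. It is not the GMLoverlay class itself (Figure 7 shows part of that code) but an assumed reconstruction of the same idea: a SAX handler collects the text of every coordinates element, each ring becomes a Java2D Area, and the areas are intersected one pair at a time. The class name, method names and sample documents are hypothetical, and inner boundaries (holes) are ignored.

```java
import java.awt.geom.Area;
import java.awt.geom.Path2D;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class GmlOverlaySketch {

    /** Turns "x,y x,y ..." (GML 2 coordinate text) into a closed Java2D Area. */
    static Area toArea(String coordinates) {
        Path2D.Double path = new Path2D.Double();
        boolean first = true;
        for (String pair : coordinates.trim().split("\\s+")) {
            String[] xy = pair.split(",");
            double x = Double.parseDouble(xy[0]);
            double y = Double.parseDouble(xy[1]);
            if (first) {
                path.moveTo(x, y);
                first = false;
            } else {
                path.lineTo(x, y);
            }
        }
        path.closePath();
        return new Area(path);
    }

    /** Uses SAX to collect one Area per coordinates element in the document. */
    static List<Area> readPolygons(String gml) {
        final List<Area> areas = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only inside a coordinates element

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (qName.endsWith("coordinates")) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (text != null) {
                    areas.add(toArea(text.toString()));
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(gml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse GML", e);
        }
        return areas;
    }

    /** Intersects the areas one pair at a time; an empty result means no common region. */
    static Area intersectAll(List<Area> areas) {
        Area result = new Area(areas.get(0)); // copy, so the inputs stay untouched
        for (int i = 1; i < areas.size(); i++) {
            result.intersect(areas.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        String ns = " xmlns:gml=\"http://www.opengis.net/gml\"";
        String soils = "<layer" + ns
                + "><gml:coordinates>0,0 10,0 10,10 0,10 0,0</gml:coordinates></layer>";
        String elevation = "<layer" + ns
                + "><gml:coordinates>5,5 15,5 15,15 5,15 5,5</gml:coordinates></layer>";
        List<Area> selected = new ArrayList<>(readPolygons(soils));
        selected.addAll(readPolygons(elevation));
        System.out.println(intersectAll(selected).getBounds2D());
    }
}
```

The sample treats each document as holding a single selected feature. A layer with several selected features would first have to be merged into one Area (for instance with Area.add) before the layers are intersected.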

    Figure 2 SQL statements that are output from the shp2pgsql utility in the PostGIS extension to

    convert shapefiles to tables in the PostgreSQL RDBMS

    Figure 3 This SQL SELECT command displays all the records (that represent features) in

    the table simpleshape (that represents a layer). For illustration purposes we are showing a

    layer with a single feature

    STEP 6:

    An XSLT Style Sheet (svg.xsl; see Figure 8) was created to transform the GML

    code contained in the intersection.xml file to SVG graphics for display as a map. Again,

    Style Sheets are custom-made XML instructions for formatting XML files. By creating

    other XSLT Style Sheets, the GML output (intersection.xml) can potentially be formatted

    into any text-based format, for example, any of the existing XML schemas for over forty

    different areas of expertise, HTML, delimited text, or UNGENERATE Arc/Info format.
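The transformation in STEP 6 can be tried out with the XSLT processor that ships with Java. The sketch below is a hedged illustration: the embedded style sheet is a hypothetical, much reduced stand-in for svg.xsl (Figure 8 shows the real one), and it relies on the fact that the "x,y x,y" text of a GML 2 coordinates element is also valid content for the points attribute of an SVG polygon. It ignores the fact that the SVG y axis points downward, so a real map would also need a vertical flip.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class GmlToSvgSketch {

    // Hypothetical stand-in for svg.xsl: one SVG polygon per gml:coordinates element.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:gml='http://www.opengis.net/gml'"
            + " xmlns='http://www.w3.org/2000/svg'"
            + " exclude-result-prefixes='gml'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:template match='/'>"
            + "<svg><xsl:for-each select='//gml:coordinates'>"
            + "<polygon points='{normalize-space(.)}'/>"
            + "</xsl:for-each></svg>"
            + "</xsl:template></xsl:stylesheet>";

    /** Applies the style sheet to a GML document and returns the SVG text. */
    static String transform(String gml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(gml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String gml = "<intersection xmlns:gml='http://www.opengis.net/gml'>"
                + "<gml:coordinates>5,5 10,5 10,10 5,10 5,5</gml:coordinates>"
                + "</intersection>";
        System.out.println(transform(gml));
    }
}
```

Swapping in a different style sheet string is all that is needed to target another text-based format, which is the flexibility the step describes.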

    STEP 7:

    For convenience and due to the short development cycle for the prototype,

    we used MapServer to provide the map layout interface (frames, legend, scale bar and

    zoom levels) and for rendering the individual layers contained in the spatial database.

    The graphics generated by MapServer are then used as background for the display of

    the SVG map representing the intersection result. We used the SVG interactivity capab-

    ilities as follows: (1) the onmouseover event is handled to report the real-world coordin-

    ates of the mouse position on the SVG map, and (2) the onclick event is handled to

    send a query to the PostgreSQL-PostGIS database that returns the values for each of

    the layers at the point where the mouse is clicked (the equivalent of spearing through

    the layers and pulling out the values for each layer). Plate 1 shows the SVG map that the

    user receives in response to his or her query and geo-processing request.
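Reporting real-world coordinates for a mouse position comes down to a linear mapping between the image and the map extent. The sketch below shows that arithmetic in Java for consistency with the other examples; in the prototype the calculation would run as script attached to the SVG events. It assumes a north-up image that exactly covers the extent, and the image size and extent values are hypothetical.

```java
public class PixelToWorldSketch {

    /**
     * Maps a pixel position to map coordinates for a north-up image of the
     * given size that covers the extent (minX, minY) to (maxX, maxY).
     */
    static double[] toWorld(double px, double py, int width, int height,
                            double minX, double minY, double maxX, double maxY) {
        double x = minX + px / width * (maxX - minX);
        double y = maxY - py / height * (maxY - minY); // pixel rows grow downward
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        // Hypothetical 600 x 400 pixel map covering a 60 km by 40 km extent.
        double[] world = toWorld(300, 100, 600, 400, 200000, 2200000, 260000, 2240000);
        System.out.println("x = " + world[0] + ", y = " + world[1]);
    }
}
```

The resulting pair is also what the onclick handler would send to the database to retrieve the value of each layer at that point.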

    5 Discussion

    We demonstrated how spatial information in a proprietary GIS format can be converted

    to tables and managed in a RDBMS environment. This simple transformation makes the

    information contained in GIS layers easier to combine and process with information

    contained in other DBMS applications (such as accounting and inventory systems) that

    might be part of the organization's IT infrastructure. In the case of INIFAP, after this

    transformation, geographic information (such as the extent of a feature of interest, or

    distance between two features) can be combined and analyzed in a RDBMS environment

    (without having to link to external GIS systems) with information about crop

    yields, rural census data, or results of fertilization experiments contained in other

    RDBMS applications. Of course the existing RDBMS do not (and probably never will)

    have all the spatial analytical capabilities of a GIS system. However, they are constantly

    evolving and in the near future they will have enough capabilities to satisfy many simple

    geo-processing requirements. For example, currently PostgreSQL-PostGIS is capable of

    performing over sixty spatially related operations such as finding the extent of a feature

    or group of features, distance in projected units between two features, selection of

    features to the left or right of a feature, and intersection of two feature extents. There

    are plans in the near future to add topological operators to the PostGIS module including:

    touches, contains, overlaps, buffer, union and difference. Hence in the future the

    Boolean intersect overlay operation we implemented in the prototype system could be

    performed directly in the PostgreSQL-PostGIS RDBMS using database records.

    Figure 4 Sample HTML form

    Figure 5 The overlay processor interface

    We also chose, in STEP 1, to transform the geographic layers from ESRI's shapefile format to

    tables in a RDBMS, where SQL queries were then executed to extract desired

    features and attributes from each layer. Then using a PHP script, the strings of text

    returned by these queries were converted to GML documents that represent the selected

    spatial features and their attributes. Once the GIS layers are in GML format, they can be

    passed to any system, application or geo-processing service that is able to read this Open

    Specification. This is how the use of this OS enables a certain degree of interoperability

    between applications. These applications or services can reside on a single machine, on

    a local-area network, or on any server connected to the Internet. A packet of information

    (e.g. a layer or pieces of it in GML format) can be passed from application to application

    (and these could be of a very different nature), adding or extracting information to or

    from the original packet until the desired end result is obtained. This is how interoperability

    facilitates distributed processing. In other words, compliance with Open Specifications

    enables interoperability between heterogeneous environments and systems, facilitates

    distributed processing and opens new possibilities for the combination and processing of

    geographic data.

    Figure 6 Example of GML code representing a single feature (see Figure 9) selected from

    one of the layers in the spatial database

    Figure 7 This piece of Java code links to the Java2D API and invokes the Constructive Area

    Geometry Methods contained in this API to generate the intersection, subtraction, addition,

    and exclusiveOr of two Java2D area objects. We are currently using only CASE 3 for the

    intersect method

    We also used a PHP script to convert the string returned as a result of the query for

    the desired features (records) in the PostgreSQL-PostGIS RDBMS to a GML document

    in STEP 4. This process worked well for small data sets; however, when we started to

    process larger areas the processing time was unacceptable (several minutes). To minimize

    the number of selected features that have to be converted to GML we used the bounding

    box intersect function within PostgreSQL-PostGIS. This function returns the records

    (features) from one table (layer) whose bounding boxes intersect the bounding boxes of

    features in a second layer. In this way only the records that fulfill the query and whose

    bounding boxes intersect with bounding boxes of features from a second layer are

    returned for conversion to GML documents. This preprocessing greatly reduces the

    number of records that must be converted to GML for performing the actual intersection

    of features in the GMLoverlay.class program, and allowed us to process larger geo-

    graphic areas. In the latest version of the prototype system we were able to significantly

    improve the response times by: (1) replacing the PHP scripts that control the

    process flow with Perl CGI scripts; and (2) consolidating the conversion to SVG

    and GML formats of the overlay result (intersect.xml) into the GMLoverlay.class Java

    class. The full code for this implementation can be found at the demonstration website

    (http://206.168.217.254/guanajuato/).

    We then created a Java class (GMLoverlay.class) that implements the intersect func-

    tion that is part of the Constructive Area Geometry Methods included in the Java2D

    API in STEP 5. In the same way we could have as easily implemented any of the other

    operators that are included in the Java2D API (union, subtraction, and exclusive-OR)

    to provide these geo-processing capabilities in the system. In fact, the Java code

    in Figure 7 already invokes these methods from the Java2D API; we are simply not currently

    using them.

    In STEP 6 we demonstrated the use of XSLT Style Sheets to convert a layer in GML

    format (intersection.xml) to SVG (for graphical display). If geographic data in GML

    format gains popularity, Style Sheets could be developed to transform GML documents

    to a wide array of formats (e.g. any text format, any XML schema, or GIS proprietary

    formats) used by other systems and applications. In addition, through the use of XSL

    Formatting Objects (XSL-FO) these data could be converted to different printing formats

    for high quality output. Eventually, libraries of Style Sheets could be posted on websites,

    downloaded to reside locally, or invoked directly from remote servers to perform a

    transformation. This possibility would greatly increase the speed and ease with which

    geographic data is made available to a broader array of IT applications.

    Figure 8 XSLT Style Sheet to transform the GML document (intersection.xml), which represents

    the area where the selected environmental parameters intersect, to SVG for display on

    the end user's browser

    6 Conclusions

    State-of-the-art web-based geo-processing solutions can be implemented using currently

    available OSS and OS. The required technology for solving spatial problems in an

    Internet-based computing environment is available from at least one mature OS project. We used

    the OSS we consider to be the most powerful, widespread, accessible, easy to learn, and

    with a good level of user support in the form of software documentation, books and

    user-group forums. Typical OSS installation involves downloading the source code for

    the target Operating System (e.g. Linux, Windows XP, NT or 2000), identifying and

    downloading other required software components, configuring the desired features and

    compiling the application. This process is straightforward and routine for most person-

    nel with general IT backgrounds, but it could be intimidating for casual users with little

    programming experience. However, most mature OSS are well supported with thorough

    installation instructions and any motivated GIS user should be able to install and start

    using the OSS presented in this article.

    Given that the purpose of the project here described was exploratory, we ended up

    using a wider array of OS and OSS than would probably have been optimally required

    to create the functionality present in the prototype system. For the same reason, we took

    more than the strictly necessary steps to produce the desired results. For the develop-

    ment of the prototype system we ended up using more than 10 Open Source technolo-

    gies. Even by using the minimum required number of OSS and OS, one of the issues

    faced when implementing advanced OSS web-based GIS solutions is the breadth of

    technical skills required and the logistics of orchestrating the interaction of many

    applications. Designing web-based GIS solutions requires a thorough understanding of

    core WWW technologies (such as the configuration and management of web servers),

    spatial information management expertise, and the ability to choreograph the geo-

    processing steps required to solve spatial problems in a distributed environment. It is

    not difficult to find these necessary skills in an IT department or in a highly motivated

    power user.

    In developing the prototype system, we learned: (1) the potential of SVG to develop

    highly interactive mapping applications on the web; (2) PostgreSQL-PostGIS is a robust

    database management system that offers a considerable, and continuously increasing,

    number of geo-processing functions; (3) Java2D can be effectively used for basic 2D vector

    overlay (a specifically geo-spatially oriented Java API such as the OS Java Topology

    Suite (http://www.vividsolutions.com/jts/jtshome.htm) is clearly preferable for more

    advanced geo-processing, but Java2D is a good starting point for performing vector analysis

    with Java); and (4) MapServer proved to be an easy-to-use, production-quality Internet

    map server.

    From this experience, we can conclude that for organizations with scarce resources

    wanting to implement the distribution of their geographic data and geo-processing services

    over the WWW, the OS and OSS we used offer the following advantages:

    (1) no software costs; (2) software tools that were easily learned by personnel with

    a general IT background (UNIX, programming, database design and management);

    (3) small software footprints; (4) no need to commit to a proprietary web-GIS, DBMS

    or web software with their associated costs; (5) ease of compatibility with existing IT

    infrastructure (personnel with basic database and programming skills, existing DBMS

    software and DBMS applications); (6) flexibility to implement geo-processing capabil-

    ities currently non-existent in commercial web-GIS software (e.g. Boolean intersect

    overlays); (7) the principles to implement these technologies are straightforward and

    accessible to a broad audience of geographic information scientists and developers;

    and (8) the system developed has the potential to interoperate with other systems and

    applications that use the same OS.

    Our experience also showed that the resulting text files tend to be large when

    geographic data are converted to GML format, as illustrated by the GML code (Figure 6)

    that is required to represent a single small and geometrically simple polygon (Figure 9).

    The size of the GML files depends on the number of features and the number of points

    per feature contained in a layer (Sahay 1999 provides a formula for calculating the

    storage size of a GML document based on these two parameters). As an example, a layer

    in ESRI's shapefile format of size 342,708 bytes would occupy 599,473 bytes in its

    corresponding GML representation. In addition, because GML up to version 2.1.2 does

    not support topology, common boundaries between features must be stored twice (once

    for each feature). The size of the GML files could affect secondary storage (e.g. hard

    drive or tape) requirements, as well as the time required to parse the file to extract

    desired information. A large implementation would greatly benefit from file compression

    algorithms and highly efficient parsing methods. More research is required on both

    of these areas.
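For the example quoted above, the GML representation is 599,473 / 342,708, or about 1.75 times, the size of the shapefile. The following Java sketch shows one way the effect of general-purpose compression on GML text could be measured, using the gzip support in the standard library. The choice of gzip is an assumption made here and not a method evaluated in this article, and the synthetic sample is far more repetitive than real data, so its compression ratio should not be read as representative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GmlCompressionSketch {

    /** Returns the number of bytes the text occupies after gzip compression. */
    static int gzippedSize(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new IllegalStateException("Compression failed", e);
        }
        return buffer.size(); // read after the stream is closed and fully flushed
    }

    public static void main(String[] args) {
        // Size ratio for the layer reported in the text.
        System.out.printf("GML / shapefile = %.2f%n", 599473.0 / 342708.0);

        // Synthetic GML made of one small polygon repeated many times.
        StringBuilder gml = new StringBuilder("<layer>");
        for (int i = 0; i < 1000; i++) {
            gml.append("<gml:Polygon><gml:outerBoundaryIs><gml:LinearRing>")
               .append("<gml:coordinates>0,0 10,0 10,10 0,10 0,0</gml:coordinates>")
               .append("</gml:LinearRing></gml:outerBoundaryIs></gml:Polygon>");
        }
        gml.append("</layer>");
        String text = gml.toString();
        System.out.println("raw bytes: " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzip bytes: " + gzippedSize(text));
    }
}
```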

    Figure 9 SVG graphic of a simple feature. Figure 6 shows the GML code from which this

    image is generated

    Next, we will test the scalability of these technologies to create the full

    implementation for INIFAP's web-based spatial information needs. In addition, we are

    planning to make a full evaluation of the reliability, performance, security, and total

    costs of ownership for the system. We also need to make the overlay processor interface

    more user friendly and improve its capacity to explain the intersection results obtained.

    Finally, so far we have dealt only with geographic data in vector format, and we are

    currently working on developing web-based geo-processing capabilities for raster data.

    Acknowledgements

    The authors would like to thank the National Institute for Forest, Agriculture and

    Livestock Research (INIFAP) Central Region and especially Dr. Hilario Garcia-Nieto

    for their support and cooperation in the development of this project. We would also like

    to thank the anonymous reviewers for their helpful comments and suggestions for

    improvement of an earlier draft of this manuscript.

    References

    Aloisio G, Milillo G, and Williams R D 1999 An XML architecture for high-performance web-based analysis of remote-sensing archives. Future Generation Computer Systems 16: 91–100

    Boumphrey F, Direnzo O, Duckett J, Graf J, Houle P, Hollander D, Jenkins T, Jones P, Kingsley-Hughes A, Kingsley-Hughes K, McQueen C, and Mohr S 1998 XML Applications. Birmingham, Wrox Press

    Dangermond J 2002 Web services and GIS. Geospatial Solutions 12(7): 56

    Ducket J, Griffin O, Mohr S, Norton F, Stokes-Rees I, Williams K, Kurt Cagle, Nikola O, and Tennison J 2001 Professional XML Schemas. Birmingham, Wrox Press

    Eisenberg J D 2002 SVG Essentials. Sebastopol, CA, O'Reilly & Associates

    Gardels K 1997 Open GIS and on-line environmental libraries. SIGMOD Record 26: 32–8

    Gould M and Ribalaygua A 1999 A new breed of web-enabled graphics. GeoWorld 12(3): 46–9

    Harvey F 1999 Designing for interoperability: Overcoming semantic differences. In Goodchild M F, Egenhofer M, Fegeas R and Kottman C (eds) Interoperating Geographic Information Systems. Boston, MA, Kluwer: 85–97

    Hecht L 2002a Get your free interoperability roadmap. GeoWorld 15(2): 22–3

    Hecht L 2002b Insist on interoperability. GeoWorld 15(4): 22–3

    Hecht L 2002c Web services are the future of geo-processing. GeoWorld 15(6): 23–4

    Herring J 1999 The OpenGIS data model. Photogrammetric Engineering and Remote Sensing 65: 585–8

    Holland D 2001 Delivering the digital national framework in GML. GeoEurope 10(8): 29–30

    Kay M 2001 XSLT Programmer's Reference (Second edition). Birmingham, Wrox Press

    Kottman C 1999 The Open GIS Consortium and progress toward interoperability in GIS. In Goodchild M F, Egenhofer M, Fegeas R and Kottman C (eds) Interoperating Geographic Information Systems. Boston, MA, Kluwer: 39–54

    Lake R 2001a GML 2.0 enabling the geospatial web. Geospatial Solutions 11(7): 38–41

    Lake R 2001b GML lays the foundation for the geospatial web. GeoWorld 14(10): 42–5

    Lowe J 2002 Spatial on a shoestring: Leveraging free Open Source Software. Geospatial Solutions 12(6): 42–5

    McKee L 1998 What does OpenGIS Specification conformance mean? GeoWorld 11(8): 38

    Neumann A 2002 Comparing .SWF (Shockwave Flash) and .svg (Scalable Vector Graphics) file format specifications. In Proceedings of the SVG Open Developers Conference, 15–17 July 2002, Zurich, Switzerland (available at http://www.carto.net/papers/svg/comparison_flash_svg.html)

    Puhretmair F and Woss W 2001 XML-Based integration of GIS and heterogeneous tourism information. In Name (ed) Title. Berlin, Springer-Verlag Lecture Notes in Computer Science No 2068: 346–58

    Ramsey P 2002 Open source GIS fights the three-horned monster. GeoWorld 15(8): 23–5

    Rasmus L 2000 PHP Pocket Reference. Sebastopol, CA, O'Reilly & Associates

    Sahay N 2002 GMLView: A GML Map Renderer. Unpublished M.S. Technical Paper, Department of Computer Engineering, University of Minnesota (available at http://www-users.cs.umn.edu/~sahay/8701/planb_1_10.htm)

    Seth A P 1999 Changing focus on interoperability in information systems: From system, syntax, structure to semantics. In Goodchild M F, Egenhofer M, Fegeas R and Kottman C (eds) Interoperating Geographic Information Systems. Boston, MA, Kluwer: 5–30

    Stinson B 2001 PostgreSQL Essential Reference. Indianapolis, IN, New Riders

    Stones R and Matthew N 2001 Beginning Databases with PostgreSQL. Chicago, IL, Wrox Press

    Vckovski A 1998 Interoperable and Distributed Processing in GIS. Bristol, PA, Taylor and Francis

    Wheeler D A 2002 Why Open Source Software/Free Software (OSS/FS)? Look at the Numbers! WWW document, http://www.dwheeler.com/oss_fs_why.html

    Waters N 1999 Is XML the answer to internet-based GIS? GeoWorld 12(7): 32–3