Pergamon

Information Processing & Management, Vol. 30, No. 5, pp. 587-606, 1994. Copyright © 1994 Elsevier Science Ltd

Printed in Great Britain. All rights reserved 0306-4573/94 $6.00 + .00

0306-4573(94)E0003-K

TRENDS IN . . . A CRITICAL REVIEW

CATALOGING DIGITAL GEOGRAPHIC DATA IN THE INFORMATION INFRASTRUCTURE:

A LITERATURE AND TECHNOLOGY REVIEW

STEVEN FRANK

Department of Surveying Engineering/NCGIA, 131 Boardman Hall, University of Maine, Orono, ME 04469-5711, U.S.A.

(Received 15 August 1993; accepted in final form 20 December 1993)

Abstract - Advances in computer and telecommunications technology have brought on a new era in the Information Age, resulting both in the proliferation of digital geographic data and the emergence of an information infrastructure to facilitate the transfer of electronic information. We need to plan for the integration of digital geographic data into the information infrastructure. Those using the information infrastructure will require guides to access the information contained therein efficiently and effectively. Cataloging has long been recognized as an efficient method for condensing knowledge of large collections of items. Special approaches have proven successful for many applications, including digital geographic data. The advent of interconnecting catalogs to create vast virtual libraries of information leads to a need to overcome differences in special systems yet still allow users to access many of the special features found in individual systems. This article looks at the background of various cataloging approaches, at the needs of digital geographic data users, at trends in automated cataloging, and at trends in organizing resources in an electronic network environment.

1. BACKGROUND

The shifting of information from paper-based to electronic-based systems allows us to produce and disseminate this information faster than ever before. This, in turn, allows us to produce even more information, thus creating a greater store of information to choose from. Methods of finding this information, such as full text search and retrieval, that were once considered too clumsy or too labor intensive to be used with paper-based systems are becoming feasible with electronic systems. Interconnected electronic information systems allow users to search for and retrieve information at remote sites.

Differences between interconnected electronic information systems are causing problems that become more acute as we continue to automate many of the functions of these systems. Differences in the way the information is stored make it difficult for all but the most seasoned information specialists to retrieve similar information from various sources. Differences in the kinds of information held by various systems make it equally difficult to determine which information sources are appropriate for a particular need.

This article looks at some of the problems of finding electronic information in the information infrastructure, with special emphasis on finding digital spatial (geographic) information. We look at the literature describing spatial information cataloging systems and the problems of spatial information cataloging. We also look at the literature describing the technology of the information infrastructure, concentrating on the Internet and how it can be used to find and access information. Our review, while not exhaustive, points to some of the difficulties facing users trying to find spatial information in the information infrastructure. Section 1 continues to develop some of the background of the information infrastructure and the place of spatial information in that infrastructure. Section 2 looks at the traditional approaches to finding information that, despite being transferred to electronic-based systems, are still largely based on the ways we handle information on paper. Section 3 discusses many of the new information-finding approaches that are being developed specifically for the Internet. Section 4 contains our conclusions.

1.1 The information infrastructure

Advances in computer and telecommunications technologies are creating an information infrastructure in which vast amounts of information are being stored in electronic form on computers connected over complex telecommunications links that allow users around the world to exchange this information. These new "information highways" are expected to have a catalytic effect on our society, industry, and universities similar to the advent of the telephone in the early twentieth century (Burnhill, 1991). However, as the amount of this information increases, users must cope with torrents of information flowing from numerous sources but have few means of determining if the information they need exists and, if so, where to find it. "The problem promises to become more acute" (Gould & Pierce, 1991, p. 42).

Increasing amounts of this electronic information are in the form of geographic datasets. Used for digital mapping and geographic information systems (GIS), spatial data in electronic form are hailed for their ability to be quickly and easily transferred from user to user and from application to application. Applications of digital spatial data have begun to filter down to the average American home through personal computer software such as PCGlobe, a program containing digital data in the form of world maps, and AutoMap, a program for planning automobile trips.

1.2 Geographic (spatial) data

Spatial data, in digital or analog form, represent models of the "real" world. While there are many kinds of "real" worlds represented by spatial data, ranging from molecular models to digital X-rays to CAD/CAM engineering designs, we concern ourselves here with geographic data: data that can be referenced to the earth's surface. Many geographic datasets are massive in volume and contain complex interrelations of myriad objects on or near the earth's surface. Changes in the real world, whether social or physical, require changes in the datasets comprising the models. New data are sometimes difficult and often expensive to collect (Federal Interagency Coordinating Committee on Digital Cartography [FICCDC], 1989; Frank, 1992). Spatial data in digital formats allow greater sharing of new data and thus a broader base through which to distribute data gathering costs.

Digital spatial data are valued for the uses to which they can be put (Lucas & Rose, 1991; Onsrud & Rushton, 1992). They have a longer lifespan than many types of data and often outlive both the hardware and software for which they were originally designed (Onsrud & Rushton, 1992; Thapa & Bossler, 1992). Nor does time diminish the value of old observation data (Gould & Pierce, 1991). Such characteristics differentiate digital spatial data from many other forms of information and point to the importance of including digital spatial data in the information infrastructure.

"The fundamental problems with sharing data between organizational units are related to the philosophical problem of how we express 'meaning'" (Frank, 1992, p. 11). Data may impart one meaning in one system yet may mean something else in another system. This is due to our inability to create the "perfect" spatial system applicable to all users; thus different systems support different aspects of spatial information (McAbee, 1992; Pascoe & Penny, 1990).

Digital spatial data for geographic uses can be represented in either raster or vector format. Raster format data divides space into regular intervals of area (cells) and assigns values to each area. For example, a cell in a raster data set that contained 60% land and 40% water would be assigned a value of "land." Cells can have many values assigned to them, but never more than one value from the same class. Data collected from remote sensing platforms is in raster format. Vector format data delineates features by the use of points, lines, and areas and assigns values to these elements. For example, a line could have a value of "river," "road," or "property line." Vector elements can have more than one value assigned from the same class. For example, a river could also be a property line. Digital spatial data may also be linked to nonspatial data sets, such as lists of property owners, that may be analyzed with or without the corresponding spatial data. All attempts to define intermediate forms that might support information levels of various systems must be quite complex (Pascoe & Penny, 1990) and will be subject to much change over time as spatial technology evolves and new levels of spatial information are created.
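
To make the raster/vector distinction concrete, the sketch below shows how the same river might be carried in each representation. It is illustrative only; the class names, attribute values, and majority-rule classification are invented for this example.

```python
# Illustrative sketch of raster vs. vector representations of spatial data.
# Class names and attribute values are invented for this example.

# Raster: space divided into regular cells; each cell holds at most one
# value per classification (here, a single "land cover" class).
raster_land_cover = [
    ["land", "land", "water"],
    ["land", "water", "water"],
    ["land", "land", "water"],
]

# Vector: features delineated by points, lines, and areas; a single
# feature may carry more than one value from the same class.
vector_features = [
    {
        "geometry": [(0.0, 0.0), (1.5, 2.0), (3.0, 2.5)],   # a line
        "values": ["river", "property line"],               # both apply
    },
    {
        "geometry": [(3.0, 2.5), (4.0, 2.5)],
        "values": ["road"],
    },
]

# A cell is classified by majority rule: 60% land / 40% water -> "land".
def classify_cell(fraction_land: float) -> str:
    return "land" if fraction_land >= 0.5 else "water"

print(classify_cell(0.6))            # -> land
print(vector_features[0]["values"])  # -> ['river', 'property line']
```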

1.3 Spatial metadata

Spatial metadata has been described as (1) digital information that allows the potential user to understand that data's fitness for use; and (2) information on database contents, database schema, its source and history, and its quality (Onsrud & Rushton, 1992, p. 12). Spatial metadata for cataloging spatial datasets and databases must allow users to determine if, among other things, the data are suitable to be integrated with other data (Flowerdew, 1991). We must remember, however, that such data are not merely a collection of computer maps, but are models of the real world (Al-Taha & Frank, 1992). Since spatial data can represent many complex concepts and interrelationships, metadata describing the data should somehow be able to convey to the data searcher the concepts and relationships represented in the data. Issues of "data quality" can be subjective according to the analyses to be done with the data. For example, the "quality" of information on soils will differ radically depending on whether the information was intended for agricultural surveys, where concern is for the nutrient content of the soil; for structural foundation analyses, where concern is for the load-bearing capacity of the soil; or for geological studies, where the concern is for the historical formation of the soil.
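
A minimal sketch of what such a metadata record might carry is given below. The field names simply follow the elements named above (contents, schema, source, history, quality) and are invented for illustration; they do not correspond to any particular metadata standard.

```python
from dataclasses import dataclass

# Hypothetical spatial metadata record; field names are illustrative only
# and mirror the elements named in the text (contents, schema, source,
# history, quality), not any specific standard.
@dataclass
class SpatialMetadata:
    title: str
    contents: str                 # what the dataset contains
    schema: str                   # how the data are structured
    source: str                   # who collected the data and how
    history: list[str]            # processing steps applied since collection
    quality: dict[str, str]       # quality statements, keyed by intended use

soils = SpatialMetadata(
    title="County soil survey (example)",
    contents="Soil polygons with nutrient and load-bearing attributes",
    schema="vector polygons, attribute table keyed by soil unit",
    source="field survey, 1:24,000 base maps",
    history=["digitized 1991", "reprojected to state plane 1992"],
    quality={
        "agricultural survey": "nutrient attributes verified",
        "foundation analysis": "load-bearing values estimated only",
    },
)

# Fitness for use depends on the intended analysis, not on the data alone.
print(soils.quality.get("foundation analysis"))
```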

Some see the issues of spatial metadata as overlapping and extending from the traditional requirements of bibliographic cataloging (Burnhill, 1991; Lai & Gillies, 1991; Lillywhite, 1991; Newman, 1991). Among the problems faced are the diversity of information needed, the amounts of "irrelevant" information users will accept, the ability of catalogers to cross-reference information, and the fact that not everyone understands the mathematical formulations, such as grid systems, that underlie most spatial data (Medyckyj-Scott, 1991; Newman, 1991). While systems can link sets of geometric coordinates to specific place names, differing concepts of what the actual spatial extents of a named place are can create problems for users who do not share the same concepts.

Spatial metadata issues also overlap with the data dictionaries and data directories used to design, monitor, and locate data in information systems. Efforts are underway at the national level to define minimum content standards for spatial metadata used for spatial data transfers and cataloging (Federal Geographic Data Committee [FGDC], 1992). However, such standards appear to be an ad hoc solution suited, at best, for the short term. More concerted future efforts in cooperation with other information disciplines will be needed to ensure that any spatial information infrastructure developed can eventually be fully integrated into a generic information infrastructure.

1.4 Integration issues for spatial data

Finding and evaluating spatial data are key issues for spatial data users in an information infrastructure. In some cases, users may merely desire certain information, such as who owns a parcel of land; such a need must be satisfied by access to a database containing that information. In other cases, users may want to do complex studies and wish to see if datasets suitable for such studies exist. It may well be that such data exist scattered across multiple datasets, which then must be evaluated for compatibility with one another. How these datasets are described and how both the descriptions and the datasets themselves are accessed play key roles in how users find and evaluate data.

2. CATALOGING

Imagine what it would be like when trying to find something in the libraries and databases of the world, where organization was done by someone else who had no idea of what my needs were. Chaos. Sheer chaos. (Norman, 1988, p. 215)


As we collect items, we tend to try to organize them to keep track of what and where they are. The simplest method is to build a list, or index, of the items. The list can be in random order, but as the list grows it becomes necessary to order, or classify, the list by some means to produce an index. Alphabetical classification is the method familiar to most users. But the index by itself only tells the user that the item exists, not where to find it. To find where the item exists, it is necessary to add this information to the index and create a directory. Directories tell the user (1) that the item exists, and (2) where to find the item. A directory is thus an index that names an item and points to the location of that item. The user of the directory must still remember the unique qualities of the item being searched for. As the collection of items continues to grow, the uniqueness of each item diminishes, as does the user's ability to remember the distinctions between items. It becomes necessary to add more information to the directory to distinguish apparently similar items from one another. This is done by adding descriptive information about each item sufficient to make it unique in the collection (Hufford, 1991). A catalog entry thus serves as a surrogate for the item of interest (Weibel, 1992). Technically, the term catalog refers to a list of descriptions of items found in a single collection, although this distinction is becoming lost in the age of computer databases (Hagler, 1991). The distinction between catalogs and indices or directories, then, is that indices and directories "are guides rather than surrogates . . . intended to guide searchers to, rather than inform them about, the contents of a work" (Borko & Bernier, 1978, p. 11). In evaluating means for accessing digital spatial data, it is useful to consider information about digital spatial data, or spatial metadata (Onsrud & Rushton, 1992), in the context of the distinctions made by library and information science literature among indices, directories, and catalogs.
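
These three finding aids can be summarized in a small sketch (item names, locations, and descriptive fields are invented for illustration): an index only lists items, a directory adds locations, and a catalog adds descriptive surrogates.

```python
# Illustrative data structures for the three finding aids discussed above.
# Item names, locations, and descriptions are invented.

# Index: the item exists.
index = ["County road map", "Soil survey", "Census tract boundaries"]

# Directory: the item exists, and here is where to find it.
directory = {
    "County road map": "Map library, drawer 12",
    "Soil survey": "Government documents, shelf 4",
    "Census tract boundaries": "Data archive, tape 17",
}

# Catalog: a descriptive surrogate sufficient to distinguish the item
# from apparently similar items in the collection.
catalog = {
    "County road map": {
        "location": "Map library, drawer 12",
        "scale": "1:100,000",
        "date": 1989,
        "publisher": "State DOT",
    },
    "Soil survey": {
        "location": "Government documents, shelf 4",
        "scale": "1:24,000",
        "date": 1984,
        "publisher": "USDA",
    },
}

# The catalog entry informs the user about the item before retrieval.
print(catalog["County road map"]["scale"])
```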

2.1 Bibliographic cataloging

Cataloging has long been recognized, particularly in the library community, as a method for describing a certain item and giving the location of that item. As early as 1850, and perhaps far earlier, the importance of full and accurate cataloging for managing large collections of items has been noted. The first edition of Cutter's Rules for a Printed Dictionary Catalogue was published by the United States Bureau of Education in 1876 (Hufford, 1991). "Descriptive cataloging is the subset of cataloging activity which involves (1) providing a bibliographic description of an item sufficient to identify the item and to provide to a prospective user certain information necessary to make judgments about its usefulness and (2) formulating uniform access points to enable the potential user to retrieve the bibliographic record" (Fenly, 1992, p. 56). Cataloging principles of adequate identification and consistency have developed into models and rules that are used to compile catalogs. The principles attempt to ensure that (1) separate items do not become confused with one another; (2) the descriptions allow access points at junctures relevant to users of the catalog; and (3) the descriptions are presented in a uniform way so that they may be interpreted unambiguously (Hagler, 1991). Traditionally, library catalogs have served to inform users "whether a library has a certain item" and to allow users to locate the item within the library (Hufford, 1991, p. 35). In the library setting, the user often searched through files of index cards seeking items of interest. With the advent of computerized card indexes and information retrieval systems, a transformation of the way we seek information has begun to occur (Hirshon, 1993).

Catalog entries can consist of abstracts of the item cited. Abstracts give a broad outline of the contents of the item, thus giving the user a sense of the item's "fitness for use" for a given purpose. Another approach is to attempt to measure or gauge the utility of the retrieved item to the requester in order to determine which retrieval elements are best suited to describe the "fitness for use" (Buckland, 1988). Such information may not always be determinable from inspection of the item itself, such as the scale and projection of digital map files. This is especially critical with scientific and spatial data, where applications of the data are severely constrained by how the data were collected and processed. Information such as the size of an item, the cost of the item, where the item might be purchased, and where the item is published also may be included in the catalog description. International standards for cataloging have been developed as the International Standard Bibliographic Description (ISBD) and encoded in the Anglo-American Cataloguing Rules, second edition (AACR2) (Hagler, 1991; Hufford, 1991).

2.2 Bibliographic retrieval systems

Bibliographic retrieval systems link users to documents by matching document surrogates (titles, abstracts, indexing terms, etc.) to the searcher's queries. [R]etrieval effectiveness . . . largely depends on the ability of the surrogates to represent their source documents accurately and thoroughly while distinguishing them from a multitude of other items. (Tibbo, 1992, p. 33)

The Machine-Readable Cataloging (MARC) format was developed by the Library of Congress and the American Library Association's Machine-Readable Bibliographic Information (MARBI) Committee to allow the paradigm of manually searching index cards to be transferred to an electronic computer environment. It has grown to become an international bibliographic format standard used to transfer machine-readable records between library databases. It is not actually a single format, but rather a closely linked series of formats, each developed under the auspices of national libraries of participating countries. The version used by libraries in the United States, known as USMARC or LCMARC, was developed using the National Information Standards Organization (NISO) ANSI Z39 standards and is compatible with AACR2 (Crawford, 1989; Hagler, 1991; Hunter, 1985). A universal MARC format (UNIMARC) is used to translate between different national versions. The MARC format is a variable length field format with tags for identifying each field (Crawford, 1989; Hunter, 1985). While "not every library using the MARC format for machine readable records fills in every possible processing or searching code," the records can be interchanged between libraries and between bibliographic retrieval systems even if incomplete (Hagler, 1991, p. 37).

The MARC format has been criticized for some shortcomings relevant to digital spatial data. Ercegovac and Borko argue that use of the MARC format "requires cataloger's memorization of a large number of rules, which change frequently" (Ercegovac & Borko, 1992b, p. 267). Lynch (1993b) notes that many MARC "variants" among different interest groups have led to some problems in exchanging bibliographic data. And while the MARC record format does contain fields for identifying spatial attributes (Library of Congress, 1988), such as map scale or map bounding coordinates, the use of such fields is inconsistent. There is word that the library community is attempting to fix these problems. However, many current bibliographic databases which accept MARC-formatted data do not allow users to search for information based on those spatial attribute fields.
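
As a rough illustration of that criticism, the sketch below models a bibliographic record as tagged, variable-length fields and shows why a retrieval system that indexes only title-like fields cannot answer a coordinate-based query even when a coded spatial field is present in the record. The tag numbers and field contents are simplified stand-ins for illustration, not a faithful rendering of USMARC semantics.

```python
# Simplified, hypothetical tagged-field record in the spirit of MARC:
# variable-length fields identified by numeric tags. Tags and values here
# are illustrative stand-ins, not actual USMARC semantics.
record = {
    "100": "Frank, Steven",                        # author-like field
    "245": "Digital elevation model, Orono quad",  # title-like field
    "255": "Scale 1:24,000",                       # cartographic data as text
    "034": {"west": -68.75, "east": -68.625,       # coded spatial extent
            "south": 44.875, "north": 45.0},
}

def title_search(records, term):
    """A text-only retrieval system: searches title-like fields only."""
    return [r for r in records if term.lower() in r.get("245", "").lower()]

def spatial_search(records, lon, lat):
    """A spatially aware system: tests the coded bounding-box field."""
    hits = []
    for r in records:
        box = r.get("034")
        if box and box["west"] <= lon <= box["east"] \
               and box["south"] <= lat <= box["north"]:
            hits.append(r)
    return hits

# The spatial field is present in the record, but only a system that
# indexes it can answer "what covers this point?" queries.
print(len(title_search([record], "elevation")))    # 1
print(len(spatial_search([record], -68.7, 44.9)))  # 1
```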

It should be noted, however, that despite any possible shortcomings of MARC, it has proven very successful for integrating bibliographic data in the library community. It is the "most commonly used bibliographic format for computer processing in the United States" (Crawford, 1989, p. 3).

Rivera and Wood (1992) report that a survey of 142 libraries in Canada and the United States revealed that 46% of these libraries now hold digital spatial data in some form. It appears that these libraries are cataloging digital spatial data sets as Machine-Readable Data Files (MRDFs) (Larsgaard, 1992). Digital spatial data will be increasingly sent to U.S. libraries on deposit as federal agencies continue to switch from paper to electronic information (Rivera & Wood, 1992). “One working assumption is that massive amounts of human effort will not be available to catalog these resources on an ongoing basis” (Lynch & Preston, 1992, p. 16).

2.3 Geographic data cataloging

"Geographic/cartographic [spatial] applications are characterized by massive volumes of data, both spatial and non-spatial, as well as the need to record their evolution in time" (Guenther & Buchman, 1990, p. 62). Because of the voluminous nature of geographic data, they are prime candidates for cataloging, providing users with much smaller surrogates useful for evaluating the data. Two items are of special interest when cataloging geographic data: (1) what to include in each catalog description, and (2) how to index the catalog entries. Geographic data indexing can be done thematically (by subject), spatially (by location), or temporally (Ruggles & Newman, 1991; Walker et al., 1992). Geographic data are collected at many levels of resolution, at widely varying frequencies, and are sensitive to both the methodology used to collect the data and the methodology used to process the data. For example, the Landsat 5 satellite collects earth resource data across seven sensing bands at a ground resolution of 30 meters, while the SPOT satellites collect resource data across three sensing bands at a ground resolution of 10 meters (Aronoff, 1989). The Landsat 5 images provide greater information depth than the SPOT imagery, but SPOT provides a finer ground resolution for that information. Similar informational trade-offs are made for spatial data collected from aerial photography, ground surveys, and other spatial information collection systems.

Additionally, many geographic phenomena fluctuate over time or may be represented as "accumulations of observations or averages over time" (Flowerdew, 1991, p. 381). Digital geographic data must also be capable of being processed efficiently by computer applications. Applications must be able to extract from the data information that users request. The world is a very complex place. There is no application that can extract all the information users may wish to obtain from a dataset, nor is there a dataset that can contain all the information users may wish to extract. This means that the user must know what information can be extracted from the data and the level of confidence with which he or she may use that information.

Systematic information about data . . . goes beyond the descriptive fields used by the library world . . . we should distinguish between metadata fields that equate to descriptive cataloguing, and those that equate to classification or to indexing. Both are required in order to allow effective access to information. (Burnhill, 1991, p. 16)

Indexing methodology plays a crucial role in how users may search for items. Larsgaard (1978) identifies several methods of providing geographic coding (or "geocoding") for spatial representation. These are

1. Using nominal values that do not indicate spatial relationships among entities (place names);

2. Using unique designations for undefined or implicitly defined locations, such as zip codes;

3. Using ordinal values to indicate relative positions of spatial units within some defined system, such as census tracts; and

4. Using explicit boundary delineations in mathematical form, such as state plane coordinate values or latitude and longitude values.

Within each method of geocoding there also exist multiple implementations. For example, place names will differ among various languages or even within the same language according to regional dialects. Numerous ellipsoidal models of the earth have been developed to represent geographic coordinate values, along with many local plane coordinate systems that attempt to "flatten" portions of the earth's surface. All coordinate-based implementations can be translated to other coordinate-based implementations, although there is sometimes a loss of positional accuracy. Coordinate values can also be translated to nominal values, although such translations can often be extremely subjective in nature.
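
The sketch below, built on an invented gazetteer, shows how the same location might be carried under several of the geocoding methods listed above, and how translating a coordinate value to a nominal place name requires someone's definition of that place's extent, which is where the subjectivity enters. All names, codes, and coordinate extents are illustrative assumptions.

```python
# One location expressed under several geocoding methods (values invented).
location = {
    "place_name": "Orono",            # nominal value
    "zip_code": "04473",              # implicitly defined designation
    "census_tract": "23019-0405",     # ordinal within a defined system
    "lat_lon": (44.90, -68.67),       # explicit mathematical form
}

# Translating coordinates to a place name requires a prior decision about
# the spatial extent of each named place; this toy gazetteer uses
# rectangular extents, which real places rarely have.
gazetteer = {
    "Orono":  {"south": 44.85, "north": 44.95, "west": -68.75, "east": -68.60},
    "Bangor": {"south": 44.75, "north": 44.85, "west": -68.85, "east": -68.70},
}

def coords_to_place(lat, lon):
    for name, ext in gazetteer.items():
        if ext["south"] <= lat <= ext["north"] and ext["west"] <= lon <= ext["east"]:
            return name
    return None

print(coords_to_place(44.90, -68.67))  # -> Orono, under this gazetteer's definitions
```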

This becomes critical even in a computer environment. As noted earlier, text-based formats do not allow many forms of spatial indexing (the ability to retrieve information based on spatial coordinate or other non-place-name values), and thus users lack the ability to perform certain kinds of precise spatially based searches. Several alphabetic, alphanumeric, and numeric systems have been tried for cataloging spatial data (Larsgaard, 1978), but the most successful approach has been the use of index maps: maps that geographically index the existence of cataloged materials by showing those materials as areas or features on the index map. Some maps, such as the U.S. Geological Survey's topographic quadrangle series, have printed index maps. Users may scan the index map, determine the area of interest, note the map names or numbers listed on the index map which are needed for closer examination, and then retrieve those maps. Many agencies, companies, and libraries dealing with spatial information have produced custom, hand-labeled index maps (Holmes, 1990). Digital spatial data systems seem to be developing indexing systems based on "tiling" schemes that allow the user to zoom up or down through the use of tiles, or computer representations of map sheets (Carpenter, 1992; Medyckyj-Scott, 1991; Vrana, 1992; Walker et al., 1992). However, even with the advent of computer technology, the process of searching indexes and then viewing data can be a tedious, iterative process (Rubin, 1992). In the case of digital spatial data, one may also request that, if an appropriate dataset is found, it be in a format suitable to be displayed by a particular application process in order to assess further the possible utility of those data. However, even if one can view the data, many "fitness for use" characteristics, such as scale, completeness, or author, will be neither explicitly nor implicitly derivable without the use of spatial metadata.
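
A tiling scheme can be sketched as a simple lookup from tile identifiers to the datasets falling within each tile. The tile naming scheme and dataset names below are invented, and real systems typically nest tiles at several zoom levels.

```python
# Toy tile index: each tile identifier (an invented naming scheme) maps to
# the cataloged datasets whose extent falls within that tile.
tile_index = {
    "N44W069_NE": ["orono_roads_1992", "penobscot_hydro_1990"],
    "N44W069_NW": ["old_town_soils_1984"],
    "N44W069_SE": ["orono_roads_1992"],
}

def datasets_in_tiles(tiles):
    """Union of datasets indexed under the requested tiles."""
    found = set()
    for t in tiles:
        found.update(tile_index.get(t, []))
    return sorted(found)

# A user "zooms" to an area covered by two tiles and sees what is indexed
# there; evaluating fitness for use still requires the metadata itself.
print(datasets_in_tiles(["N44W069_NE", "N44W069_SE"]))
```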

2.4 Geographic data retrieval systems

Many systems have been devised especially to handle spatial data, such as the U.S. Geological Survey's (USGS) Geodex system (Larsgaard, 1992), the Research Libraries Group's Georeference Information Network (Carver, 1989; RLG, 1989), and the commercial Global Data Catalog (Global, 1992), which allow spatially based searches. However, many new systems, both text based and spatially based, "have never gone into production because of reliability problems" (Fenly, 1992, p. 52). Users seem to tolerate certain small amounts of incorrect or nonoptimal information but must have faith that the system will reliably find all of the information requested (Fenly, 1992). However, some users seem to infer a mystical intelligence in retrieval systems (Hirschheim & Newman, 1991) and may become frustrated because the system fails to respond as they think it should. Additionally, many systems are designed primarily for the needs of certain users of spatial data and do not appear to have the generality to accommodate the needs of other possible users of spatial data. Nor does it seem likely that a single system that would satisfy all users of spatial data will, or should, come along in the near future. Some of the systems used for cataloging digital spatial data are briefly described next.

The Louisiana Coast Geographic Information System Network (LCGISN) is an effort by the Louisiana Geological Survey, the U.S. Geological Survey, and Louisiana State University to create a system for referencing and accessing spatial databases containing coastal information in the state of Louisiana. The project seeks to identify the most important existing databases and incorporate them into a network of GIS sources (McBride et al., 1991). The spatial index/bibliography will link geographic location to maps, imagery, photographs, or names. Bibliographic references will also be linked to coordinates or place names where possible (McBride et al., 1991). Spatial indexing may be done using a tiling scheme based on quarters of USGS 7.5-minute quadrangle maps. Where possible, items will be cross-referenced to materials using the USMARC format (Carpenter, 1992). Users can interactively define areas of interest on map displays as well as enter textual criteria such as date, author, or subject. Separate holdings tables exist for hard copy, vector, and raster data (Hiland et al., 1992).

The Northwest Land Information System (NWLISN) is a cooperative project of several federal agencies in conjunction with state agencies in Oregon and Washington. "The NWLISN is not administratively, financially, or politically tied to any one group or agency" (Wright & Lee, 1992). It is a "metaGIS" implemented using Environmental Systems Research Institute's (ESRI) ARC/INFO GIS software package on a SunSparc workstation. It contains cataloged information about the digital spatial holdings of its members. A "Spatial Data Index" (SDX) lists holdings both by topic and by spatial extent. SDX consists of three parts: an Agency Database containing information about the data holding agencies participating in the project along with information about the specific data holdings of that agency; a Thematic Holdings Database "describing individual data sets reported by participating agencies;" and a Spatial Cross-Reference Database containing information about individual user tiling schemes, which allows users to request data information in their native tiling scheme. Data quality statements are available but are not independently verified (Vrana, 1992).

The Facility for Integrating the National Directory of Australian Resources (FINDAR) is a "computer-based directory of sources . . . describing primarily what data is available, who is responsible for the data, where it is located, and how it might be accessed" (Johnson et al., 1990, p. 1). It is maintained by the National Resource Information Centre (NRIC) of Australia (Johnson et al., 1991, p. 123). Searches may be done interactively with a graphic search window (E. P. Shelley, personal communication, June 12, 1992; Walsh, 1991), through keyword queries, or through dataset attribute fields (Shelley & Johnson, 1991). Spatial searches can be done against the background of large-scale standard map sheet areas, census polygons, state boundaries, or the Australian master place names list (Shelley & Johnson, 1991).

The NexpRI system is a cooperative effort by four universities in the Netherlands, under the auspices of the Netherlands Organization for Scientific Research (NWO), to develop a database of GIS resources. The effort distinguishes four types of GIS activities: projects, expertise, services, and products. The organizations which provide these aspects of GIS are the primary source of information in the database (Ottens & Zandee, 1991; van den Doel, 1992). Three criteria must be met for inclusion in NexpRI: the data must be digitally available; the data must have spatial reference to the earth; and the data must be externally accessible. Datasets that were not designed for GIS and that have no graphic representation may be included in NexpRI (van den Doel, 1992).

It appears that many groups and organizations are preparing spatial data cataloging systems that are somewhat unique to individual data holdings and data needs. The consensus among the spatial data user community seems to be that each spatial dataset catalog description should contain some information about the "fitness for use" of that data (Onsrud & Rushton, 1992) and some means of identifying the spatial extents of items cataloged. Efforts are underway to define minimum content standards for cataloging digital spatial data that would convey information about fitness for use (FGDC, 1992; United States Geological Survey [USGS], 1990), but these efforts are still in the early stages of development. There are several approaches to providing fitness criteria. First, one might include fitness for use information for each and every catalog entry. Second, one might use fitness criteria itself as another classification scheme, separating entries into separate classifications in an index, into separate indices, or even into separate catalogs. The former approach is best suited when many datasets are produced independently for special needs using observation equipment especially suited for that particular need. The latter approach would make sense when large numbers of datasets are produced in a similar manner using consistent observation equipment. Many private company products would fall into the first category, while most government products would fall into the second category. In reference to the latter, Cornelius argues that

many small catalogues appropriate for particular groups of clients may be more suitable than global, general directories . . . coherence and consistency in a small directory is likely to be greater . . . knowledge of the data requirements and information required by a target audience may lead one to produce a more appropriate directory . . . a globally useful directory . . . will necessitate the inclusion of vast amounts of information about each dataset. (Cornelius, 1991, p. 49)

Burnhill (1991) also cites the need for a union catalog that combines the catalog information distributed across individual catalogs. This approach appears to assume much homogeneity among data catalogs that seems inconsistent with Cornelius's argument for diverse data catalogs. Lynch and Preston note that "the vision of a single, simple unified directory of networked resources . . . seems somewhat chimerical, if this directory is to support much beyond the simple known item lookup function" (Lynch & Preston, 1992, p. 21).

The description of spatial extent is being handled by several means. Some systems employ spatial tiles that are uniquely identified and assigned to datasets appearing within their spatial extent. Other systems use one or more location coordinate values or other explicit spatial referencing values as part of the description. Still other systems allow the use of place names for spatial referencing. These approaches may be combined in various manners in many digital spatial data cataloging systems.
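
A sketch of how these approaches might be combined in a single catalog entry, and of the bounding-box overlap test that an explicit-coordinate search reduces to, is given below. The field names and entries are invented for illustration, not drawn from any of the systems described above.

```python
# Hypothetical catalog entries that mix the approaches: a place name, an
# explicit bounding box, and a tile identifier. All values are invented.
entries = [
    {"title": "Coastal wetlands 1991", "place": "Terrebonne Parish",
     "bbox": (-91.2, 29.2, -90.4, 29.8), "tile": "Q15_SW"},
    {"title": "Statewide roads 1990", "place": "Louisiana",
     "bbox": (-94.1, 28.9, -88.8, 33.0), "tile": None},
]

def overlaps(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    aw, as_, ae, an = a
    bw, bs, be, bn = b
    return aw <= be and bw <= ae and as_ <= bn and bs <= an

def search(area_of_interest, place=None):
    hits = []
    for e in entries:
        if place and e["place"] != place:
            continue
        if overlaps(e["bbox"], area_of_interest):
            hits.append(e["title"])
    return hits

# An area of interest along the Louisiana coast; both entries overlap it.
print(search((-91.0, 29.0, -90.0, 30.0)))
```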

2.5 Trends in cataloging and information retrieval automation

There appear to be three overlapping areas of research in automated cataloging: (1) fully automated cataloging, (2) computer-assisted cataloging, and (3) nontraditional cataloging (Weibel, 1992). Automated cataloging attempts appear to be concentrated in the area of artificial intelligence with "selective sensitivity to various important properties" of data (Metzler, 1992, p. 10) and in document image analysis, which uses algorithms to extract semantic information from text, graphics, and images (O'Gorman & Kasturi, 1992). Of special interest to digital spatial data users is research into the automated extraction of metadata from the header information of remotely sensed image files (Cromp, 1989) and the automated abstraction of multiframe images (Rorvig, 1993). Computer-assisted systems primarily aim to guide the user to catalog specific items correctly (Ercegovac & Borko, 1992a).

Research in the area of information retrieval seems to fall into similar overlapping areas:

1. Heuristic, or belief revision, algorithms that aim to have databases “intelligently” understand user needs (Cawsey et al., 1992; Jennings & Higuchi, 1992);

2. "Multiversion" indexing and retrieval query formulation schemes which allow users numerous points of entry to items that might be referenced differently on different systems (Frants et al., 1993; Yannakoudakis et al., 1990); and

3. Image retrieval for human visualization (Bourne, 1989; McDonald & Blake, 1991) or similarity matching (Jagadish, 1991; Lee & Hsu, 1992).

A special area of research for digital spatial data cataloging has been development of the "metaGIS." This approach attempts to use the tools of spatial databases and spatial data analysis as a basis for cataloging spatial metadata (Marble, 1987; Ruggles & Newman, 1991; Vrana, 1992). This approach would seem to have a higher set of requirements than catalog-based systems (Akervall et al., 1992) but would be useful to many experienced spatial data users. The greatest value of metaGIS systems will be their ability to interact with catalog-based systems and thus not restrict use over a global network environment. One trend in this approach has been the development of "data browsers," which allow searchers to graphically browse datasets in "map" form from one or more databases (Blue & Lee, 1992; Evans et al., 1992; Handley et al., 1992). Such attempts currently seem to be focused on small collections of spatial data such as might be contained within single organizations. However, possibilities of expansion to Internet-wide applications are foreseeable, although, as noted earlier, many characteristics of data are not readily nor clearly apparent from merely viewing the dataset.

2.6 Summary

Cataloging and indexing serve as valuable tools to determine (1) if an item exists; (2) where it is located; and (3) the possible use and value of the item. While many cataloging and indexing systems have been transferred from paper to computer form, these systems still rely heavily on human interpretation of information to determine where an item is located and what use and value an item might have. In the environment of electronic networks, the distinctions between these functions may blur as users come to expect computers not only to find information, but also to retrieve and evaluate that information. In the case of geographic information, users may expect sophisticated online retrieval systems that allow them to define areas of interest by one or more methods of geocoding. However, differences in geographic retrieval system concepts and interfaces make it difficult for users to learn to use a wide variety of systems. Methods are needed to help standardize these systems. These methods should also be extendible so that users might eventually find, retrieve, and evaluate information in a single apparent operation. These problems are beginning to be addressed by the networking community, as seen in Section 3.


3. NETWORKING

The major force in the development of the proposed National Information Infrastructure (NII) is the electronic network as typified by the Internet. The Internet is a loosely coupled collection of local, midlevel, and national electronic networks stretching worldwide. One current estimate cites 10 million users connected through more than 9,000 lower level networks across 102 countries (Markoff, 1993). While there appears to be no general consensus of what actually comprises the Internet (Krol & Hoffman, 1993), many believe that the Internet itself is not an entity so much as it is a concept. Some believe that the Internet is restricted to those networks using the transmission control protocol/internet protocol (TCP/IP), which provides the basis for interoperating between different hardware systems and providing electronic mail services, file transfer protocols using File Transfer Protocol software packages, and remote system login using software packages such as TELNET. Networks that do not provide all these services are usually not considered part of the Internet (Kahin, 1992a; Krol, 1992). Others believe that "the Internet is not delimited by the internet protocol but by the broader notion of interoperability. Parts of the Internet support multiple protocols" (Kahin, 1992a, p. 7). Gateways allow two-way connection between the IP networks and many non-IP networks, such as DECnets and Bitnet (Krol, 1992).

Of special interest to those seeking information on the network are the support levels offered by various tools used to locate and disseminate that information. Stephenson (1988) discusses the problem of information access and identifies six methods by which users find information: by retrieval, where the user knows what information is needed, knows where that information resides, and understands how to obtain it; by searching, where the user knows what information is needed, believes that the information is in the system, and is unsure how to obtain it; by browsing, where the user is unsure what information is needed and is unsure if the information is in the system; by scanning, where the user is not looking for particular information but is scanning for interesting items; by exploring, where the user is testing the limits, constraints, and capabilities of the system; and by wandering, where the user has no structure in his or her search and does not understand the system being accessed. Ideally, the perfect system would allow the user to perform the first five functions while avoiding the last function. These concepts also seem useful for evaluating network mechanisms as related to spatial data searchers.

3.1 Internet services

The Internet is also defined by its functionality: the ability to provide electronic mail services, file transfers, and remote login. Computer networks that do not provide all these services are usually not considered part of the Internet (Kahin, 1992a). The most familiar networking protocol currently used in the United States is the transmission control protocol/internet protocol (TCP/IP) (Kleinrock, 1992; Krol, 1992), which provides the basis for interoperating between different hardware systems and providing electronic mail services, file transfer protocols using File Transfer Protocol software packages, and remote system login using software packages such as TELNET.

3.1.1 File transfer protocol. File Transfer Protocol (FTP) software allows users to transfer files between computers over a network running TCP/IP. Many network sites allow open access to files on their systems by employing anonymous FTP privileges. The user can log in from a personal computer or an account on a mainframe to special system accounts as "anonymous" and receive, and in special circumstances send, files from and to the FTP site. Many anonymous FTP sites restrict who may send files to some directories as a matter of system security, allowing those logging on to send files only by use of a special password or anonymously to certain authorized directories (Deutsch et al., 1993; Kahin, 1992b). FTP users are restricted to sending and receiving files across the network. One advantage of anonymous FTP is that users are given the size of the files listed in the directories, allowing them to make available space on their own systems to import the file into and to estimate the time to download the file. Disadvantages of FTP are that only file names, sizes, and dates may be listed. The user must hope that whoever prepared the file gave it a name relevant to its content. Many anonymous FTP sites have files named "readme," "index," or some variation thereof (Dillon et al., 1993) which contain descriptive information about the files listed at the site, but this practice is voluntary, not standard. The user also has no way of knowing whether persons at the anonymous FTP site have prepared the files themselves or are merely reposting material found elsewhere on the Internet. Many files are duplicated and stored at numerous sites across the net. Similarly, differing versions of many documents also reside at different sites. The date listing seen using FTP is the date the file was posted to the system, not the actual date the file was created, leading to problems in trying to use file date alone to determine the currency of the version of a document.

Many anonymous FTP sites contain spatial datasets and software for processing spatial data (Nyman & Sealy, 1993) but offer no or limited means of evaluating those datasets. FTP functionality seems to lend itself well to data retrieval. However, a network of thousands of FTP sites would be formidable to search, browse, or scan for useful data. Users must know that an FTP site exists, what its address is, and what types of information it might hold. Many FTP sites are dynamic in terms of what information is held and are intended primarily to satisfy a small group of users. A particular FTP site may contain only one file of spatial data among hundreds of other files. Nor do FTP applications seem to satisfy cataloging needs; descriptions are limited to file format information when they are provided at all.
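
For readers unfamiliar with the mechanics, the sketch below shows an anonymous FTP session using Python's standard ftplib module. The host and directory names are placeholders, and all the caveats above apply: the listing exposes only names, sizes, and posting dates, nothing about content.

```python
from ftplib import FTP

# Anonymous FTP session against a placeholder host; the host name and
# directory are assumptions for illustration only.
HOST = "ftp.example.org"
DIRECTORY = "/pub/spatial"

ftp = FTP(HOST)
ftp.login()        # no arguments -> anonymous login
ftp.cwd(DIRECTORY)

# The LIST response gives file names, sizes, and posting dates only;
# nothing tells us whether "dem_orono.zip" is the dataset we want,
# which version it is, or how the data were collected.
listing = []
ftp.retrlines("LIST", listing.append)
for line in listing:
    print(line)

ftp.quit()
```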

3.1.2 Remote login. TELNET software allows users to log in remotely from one computer to any other computer on the network. The TELNET user may do any operation that he or she would normally perform if actually sitting at the remote system. In many cases, users access databases through interfaces that automatically appear when the user logs in. The user is normally asked to define the hardware from which he or she is accessing the account so that the local hardware can emulate the remote system hardware. Again, users may be restricted to a single menu interface when logging in. Users may not transfer files using TELNET but may be allowed the privilege of sending themselves a file via electronic mail. Otherwise, they must reaccess the site using FTP software.

Many library catalog databases, known as Online Public Access Catalogs (OPACs), are now available over the Internet and allow users to view the contents of remote information resources. However, merely developing catalog databases and placing them online with TELNET access does not provide us with an easily usable resource, both because of the differing user interfaces that must be learned and because users have no way to ascertain quickly which resources they should access to find information relevant to their needs. Even when users know which resources may have relevant material, they must access each system individually and perform individual searches on each system. These OPACs also vary in searching ability and features (Kalin, 1991; Markuson, 1991), meaning that users may have to develop different searching strategies for different systems based on knowledge of the particular system they are using.

Many online cataloging databases are also available via TELNET services, including spatial data catalogs such as USGS's Global Land Information System (GLIS) and NASA's Global Change Master Directory (GCMD). Such systems allow users to find out whether datasets they are seeking exist and to do some evaluation of those datasets (if descriptive data about the datasets exist), but users must use alternate means to actually obtain the data, usually by placing an order and receiving a tape or disk containing the dataset. The problems of OPACs are also found with spatial databases. Again, a network of thousands of remote spatial databases would be difficult to search, browse, or scan.

3.1.3 Archie. FTP does not allow the user to effectively search for files. A service called "archie" has been developed to give users the ability to search for files located at anonymous FTP sites on the network. In essence, archie collects information by entering anonymous FTP sites across the network, creates an inventory of the files and directories found at each site, and then places the inventory information into a database. Several archie services are each responsible for certain segments of the Internet, with information gathered on a regular basis and traded between the various archie systems. Users may access any one of the archie servers and query full or partial file or directory names. Archie will return a list showing the number of matches to the query that were found and the path to each found item. The path information includes the address of the anonymous FTP site and the directory path to the file or directory found. Archie can be accessed by TELNET software, by client programs, by electronic mail, or by the Prospero distributed file system (Deutsch, 1992; Obraczka et al., 1993).

Originally a purely voluntary system (Deutsch, 1992), archie faced potential maintenance problems should funding issues become pertinent. Archie has since gone commercial and is now operated by the Bunyip Corporation. Archie makes no distinction between found files or directories in the search process. The user is expected to distinguish between directory listings and file listings by interpreting the results of the query. More than one file or directory may be listed for each FTP site. In a bare-bones network environment, the user would then need to access each site and search through the directories listed for relevant files, or download the listed files to determine whether they were in fact the files desired. Archie only collects file and directory names, not information about what those files or directories may contain.
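
The essential archie operation can be sketched as a substring match against a harvested inventory of file and directory names. The site and file names below are invented, and the real service involved distributed servers that periodically traded their inventories.

```python
# Toy archie-style inventory: names harvested from anonymous FTP sites,
# mapped to (site, path) pairs. Sites and paths are invented.
inventory = {
    "dem_orono.zip": [("ftp.siteA.example", "/pub/gis/dem_orono.zip")],
    "landcover":     [("ftp.siteB.example", "/data/landcover")],       # a directory
    "readme.txt":    [("ftp.siteA.example", "/pub/gis/readme.txt"),
                      ("ftp.siteB.example", "/data/readme.txt")],
}

def archie_search(term):
    """Return every (name, site, path) whose name contains the term.
    Only names are matched; nothing is known about contents, and files
    are not distinguished from directories."""
    term = term.lower()
    return [(name, site, path)
            for name, locations in inventory.items()
            if term in name.lower()
            for site, path in locations]

for hit in archie_search("dem"):
    print(hit)
```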

Planned improvements for archie include development of an "update interrupt," which would allow sites to forward any changes automatically to an archie server as those changes are being made. A second planned development is to allow users to register requests for specific terms and automatically receive messages when updates to those terms are made at the archie server. There is also development of an Indexing Services Layer, which would allow archie-like servers with specialized indexing services that are more responsive to the needs of specific groups of information users (Deutsch, 1992).

Archie solves the problem of searching through thousands of FTP sites, and the proposed updates to archie would help users to browse and scan relevant sources of data. But archie does not address the problem of cataloging to help users evaluate such data or data sources.

3.1.4 Gopher. More sophisticated networking environments, such as the Gopher and TurboGopher systems developed by the University of Minnesota, allow users to access FTP and TELNET services through a client-server service. Gopher and TurboGopher are based on the file system metaphor and present information as a hierarchy of directories and files which users can browse for information (Alberti et al., 1992; Obraczka et al., 1993; Wiggins, 1993). Gopher is itself a protocol for distributed document search and retrieval and is based on Internet TCP/IP data delivery services, not the OSI protocols. Users can still access the abilities of anonymous FTP and TELNET services, but use of the system is presented in a much more straightforward manner. The specifics of this presentation depend on the hardware used and the version of the Gopher client interface residing on the user's machine. For example, archie search services can still be called from within Gopher, but the results may be interpreted by the Gopher client software. Gopher will construct icons representing pointers to the directories and files answering the query. The user merely has to click on these icons to search directories or retrieve files rather than typing in access codes and passwords to the FTP site as under the TELNET approach.
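
Underneath the menus, the Gopher protocol itself is very simple: a client opens a TCP connection (conventionally to port 70), sends a selector string terminated by CRLF, and reads back either a document or a menu of tab-separated item lines. The sketch below shows that exchange with Python's socket module; the host name is a placeholder.

```python
import socket

def gopher_request(host, selector="", port=70, timeout=10):
    """Send one Gopher selector and return the raw response bytes.
    An empty selector asks for the server's top-level menu. Menu responses
    are lines of tab-separated fields: item type plus display string,
    selector, host, port."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

# Placeholder host; a real client would parse the menu lines and present
# them as a browsable hierarchy of directories and documents.
if __name__ == "__main__":
    response = gopher_request("gopher.example.org")
    print(response.decode("latin-1", errors="replace"))
```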

The Gopher system also offers alternatives to archie, called "veronica" and "jughead," which not only allow the user to perform searches for files on Gopher servers but also to query what services those Gopher servers might provide. This includes TELNET sites which can be accessed through Gopher. Veronica and jughead are limited to searching sites with Gopher servers installed and, like archie, are limited to searching for titles or names of items with no ability to retrieve relevant information about the item. A recent gopher development, called "gopher+," allows gopher servers to attach metadata to gopher resources, such as file format or publisher's name. Gopher+ clients can query this metadata for additional information about the resource being selected. Gopher+ is designed to be downwardly compatible with the nonenhanced versions of gopher (Wiggins, 1993). There is some discussion of using the gopher+ metadata to allow refined searches for veronica and jughead.

Gopher and TurboGopher can be used to access spatial data catalog information at the Consortium for International Earth Science Information Network (CIESIN). They can also manage TELNET sessions with online spatial metadatabases. Gopher clients are not constrained by what they may do with resources once they are downloaded. Client applications may launch external processes, such as image browsers or data conversion software. However, all gopher clients receive the same information (Wiggins, 1993).

Gopher and TurboGopher are excellent searching, browsing, and scanning tools that are well suited to small network environments, such as university campuses, but they do not seem to extend well to finding information quickly and accurately over a national or global network environment. Nor do Gopher or TurboGopher address the data cataloging problem.

3.2 Open systems technology

The Internet can also be thought of as computers talking to computers. However, differences in computer hardware and operating systems across the Internet have brought about the development of open systems. “An open system is a system capable of communicating with other open systems by virtue of implementing common international standard protocols. . . . However, an open system may not be accessible by all other open systems” (National Institute of Standards and Technology [NIST], 1988, p. viii). A standard for open systems has been implemented by the International Standards Organization (ISO) as the Open Systems Interconnection (OSI). The standard describes seven layers of functions, with each lower layer providing support functions for the layers above it. The lower four layers provide for the reliable transmission of data, while the upper three layers detail session, presentation, and application functions (Denenberg, 1990). The lower layers allow, but are not necessary for, the implementation of the upper layers. For example, the OSI protocol application for Information Retrieval, ANSI Standard Z39.50, is being implemented on the Internet using TCP/IP interconnection protocols, not the OSI networking protocols for which it was originally intended. Another protocol under development is the File Transfer, Access, and Management (FTAM) protocol (Davison, 1990; Planka, 1990), which seems to compete with the Internet FTP protocol. “[P]art of the problem with current open systems and standards: that they are not standard. . . . Not everything is standard, may never be, and maybe never should be” (Szymanski & Morris-Jones, 1991, p. 12). This separation of functions allows network system implementors great flexibility in mixing and matching high-level network protocols to low-level network protocols already in use.

New trends in networked database retrieval are oriented toward the concept of client-server applications. Users access client machines, which then connect to server machines where desired information or software resides. The user menu resides on the client machine, so the user may see only one interface no matter how many server machines are connected. The user may have more than one interface residing on the client machine, each interface designed for specific tasks, but each task appears to be performed the same no matter which server is used. The client-server paradigm offers many advantages: (1) users learn only one interface; (2) the user does not need to know how the server system works; (3) client software can be upgraded without modifying server software; and (4) server software can be upgraded without modifying client software (Dangermond, 1992; Lee & McLaughlin, 1991; Sinha, 1992; Szymanski & Morris-Jones, 1991).
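
The advantages listed above follow from keeping the user-facing interface separate from the server internals. The deliberately simplified sketch below illustrates this separation with in-process classes standing in for networked servers; all of the class names and catalog records are invented, and a real system would of course exchange these requests over a network protocol.

    # Simplified illustration of the client-server split; everything here is
    # in-process and invented, standing in for real networked servers.
    class CatalogServerA:
        """One 'server' with its own internal storage layout."""
        _rows = [{"title": "Boston street centerlines", "year": 1992}]
        def handle(self, request):
            return [r for r in self._rows
                    if request["term"].lower() in r["title"].lower()]

    class CatalogServerB:
        """A second 'server' whose internals differ from A's."""
        _titles = {"Maine hydrography": 1991}
        def handle(self, request):
            return [{"title": t, "year": y} for t, y in self._titles.items()
                    if request["term"].lower() in t.lower()]

    class Client:
        """The only interface the user ever sees, whichever server answers."""
        def __init__(self, servers):
            self.servers = servers
        def search(self, term):
            request = {"term": term}     # one request format for every server
            results = []
            for server in self.servers:
                results.extend(server.handle(request))
            return results

    print(Client([CatalogServerA(), CatalogServerB()]).search("boston"))

Either class can be rewritten internally, or replaced by a true remote server, without the user seeing any change in the client interface.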

3.2.1 Information retrieval protocols and WAIS. Many client-server application protocols are being developed under the auspices of the Open Systems Interconnection (OSI). These protocols control the semantics of the information exchanged, while lower level presentation protocols control the syntactic representations of the information exchange (Davison, 1990; Denenberg, 1990). The aforementioned Information Retrieval protocol is one such set of standards. The standard has been defined as ANSI standard Z39.50 and as ISO standards 10162 and 10163 (Lynch, 1990; NIST, 1988). These standards do not specify or constrain how information retrieval is implemented but define the functions implementations should support and the rules for exchanging information. An abstract database schema model is used to define a common basis for all information databases. The Information Retrieval protocol was developed specifically for applications accessing remote databases (National Information Standards Organization [NISO], 1991). Client software must map user requests to a proper exchange format that is then sent to one or more servers. Each server must then map the request from the exchange format to its internal database representation and process the query. Server results are then mapped to the exchange format and sent back to the client, where they are decoded for client use. The results from one query may be used in formulating subsequent queries. Such activity is termed “relevance feedback” and can be a powerful tool for narrowing or expanding information searches (Stanfill, 1991; Stein, 1991). Actual implementation of Z39.50 applications depends on the development of user profiles that can map certain semantic needs into abstract models. Z39.50 standards for OPACs are under development, and experimental implementations are being tested at some university libraries (Riddle, 1993). The author was unable to find similar developments being proposed for digital spatial data, although the Federal Geographic Data Committee's proposed Content Standards for Spatial Metadata seem to be adaptable for such a purpose (Federal Geographic Data Committee [FGDC], 1992).
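
The sketch below illustrates only the mapping idea just described: one abstract query in a common exchange format, translated by each server into its own internal representation and translated back again for the client. It is not the actual Z39.50 encoding, which uses ASN.1-based records and attribute sets; the query fields, server names, and results are invented.

    # Sketch of the exchange-format mapping idea only; real Z39.50 encodes
    # queries and records quite differently. All values below are invented.
    abstract_query = {"use_attribute": "title", "term": "soils"}   # exchange format

    def server_sql_like(query):
        """Server 1 maps the exchange format to its own (pretend) SQL dialect."""
        internal = (f"SELECT * FROM catalog "
                    f"WHERE {query['use_attribute']} LIKE '%{query['term']}%'")
        rows = [{"title": "County soils 1992"}]       # pretend result of `internal`
        return [{"title": r["title"]} for r in rows]  # map back to exchange format

    def server_inverted_index(query):
        """Server 2 maps the same request to an inverted-index lookup instead."""
        index = {"soils": ["Soil survey, 1987 revision"]}
        return [{"title": t} for t in index.get(query["term"], [])]

    for backend in (server_sql_like, server_inverted_index):
        print(backend(abstract_query))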

The Z39.50 protocol allows computers to share understandings about the semantics of the data being passed from system to system. The Z39.50 protocol's strongest point is that it gives system implementors the capability of layering protocol applications on top of existing database systems, thus allowing users to take advantage of specific capabilities of certain databases without requiring that all databases support such capabilities. New versions of the protocol are expected to contain provisions for EXPLAIN services, which will allow clients to learn about the abilities, access points, and transfer formats of server resources (Lynch, 1990; Lynch & Preston, 1992).

The most popular adaptation of the Z39.50 protocol has been the Wide Area Information Server (WAIS). WAIS is a full text document server enabling users to simultaneously search multiple servers for information. WAIS is stateless, meaning each transaction between a WAIS server and a WAIS client is treated as a separate network transmission, thus keeping network traffic to a minimum. WAIS is based on full text search and retrieval using an inverted file format that allows the client to weight responses based on the number of occurrences of provided keywords. Multiple sites can be searched, but the user must explicitly specify each of those sites prior to performing the query. WAIS supports the concept of relevance feedback, allowing users to formulate new queries automatically from the results of previous queries (Kahle & Medlar, 1991; Machovec, 1992; Markoff, 1991). The U.S. Geological Survey is currently developing a WAIS application which will allow users to base queries on spatial criteria (e.g., by defining an area of interest on an index map shown on the computer screen or by entering geographic coordinate values).
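
A toy version of the inverted-file scoring and relevance feedback just described is sketched below. The documents are invented and the weighting is simple term counting, which is far cruder than what WAIS servers actually do, but it shows how occurrence counts rank results and how terms from a document the user liked can be folded back into the next query.

    # Toy inverted-file scoring and relevance feedback in the spirit of WAIS.
    from collections import Counter
    import re

    docs = {
        "doc1": "geographic data catalog for soils and hydrography",
        "doc2": "soils report with soils tables and soils maps",
        "doc3": "annual budget memorandum",
    }

    def tokens(text):
        return re.findall(r"[a-z]+", text.lower())

    # Inverted file: term -> {doc id: number of occurrences}
    inverted = {}
    for doc_id, text in docs.items():
        for term, count in Counter(tokens(text)).items():
            inverted.setdefault(term, {})[doc_id] = count

    def score(query_terms):
        totals = Counter()
        for term in query_terms:
            for doc_id, count in inverted.get(term, {}).items():
                totals[doc_id] += count
        return totals.most_common()

    print(score(["soils"]))                       # doc2 ranks first (3 occurrences)

    # Relevance feedback: add the terms of a document the user judged
    # relevant (doc1) back into the query to broaden the search.
    print(score(["soils"] + tokens(docs["doc1"])))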

WAIS servers have evolved into three distinct versions. The original WAIS software, developed by Thinking Machines, Inc., does not allow Boolean searching. A version developed at the Indiana University Department of Biology expands WAIS to include Boolean queries. A freeWAIS version, developed by the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR), allows Boolean searches and “wildcard” entries, such as “geograph*” to return all document instances with the words “geography,” “geographic,” “geographical,” or any other word with the root “geograph” (Perez, 1993).
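
The effect of such a trailing wildcard can be reproduced with a simple prefix match, as in the short sketch below; the word list is invented and the regular expression approach is only one way to implement the expansion.

    # Sketch of trailing-wildcard expansion of the kind freeWAIS allows.
    import re

    def wildcard_to_regex(pattern):
        """Turn 'geograph*' into a pattern matching any word with that root."""
        return re.compile("^" + re.escape(pattern.rstrip("*")) + r"\w*$",
                          re.IGNORECASE)

    words = ["geography", "geographic", "geographical", "geology", "cartography"]
    rx = wildcard_to_regex("geograph*")
    print([w for w in words if rx.match(w)])   # the three 'geograph...' words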

The WAIS system does contain a “directory of servers” which allows server implementors to describe their services (Kahle & Medlar, 1991; Obraczka et al., 1993). However, this approach is voluntary and arbitrary, and it does not require updating as the server itself is updated. Second, there is no provision in the protocol to allow database servers to inform clients as to why a zero “hit” result might have occurred. For example, some database systems will notify users that zero “hits” occurred because of the mismatch of a single term, a word which may be misspelled or require an alternative spelling to be compatible with the database (Lynch, 1990). This problem is reportedly being fixed in the new version of the Z39.50 protocol. Many words in the English language alone are spelled differently by different societies: for example, center versus centre or equalize versus equalise. Spatial data applications based on mathematical systems will require extensions to the Z39.50 protocols. However, even then, systems attempting to use Z39.50 protocols will encounter similar problems. For example, street addresses can be listed as “Fifth Street” or “5th Street.” Mathematical coordinates describing spatial extents can be based on a particular State Plane Coordinate (SPC) system or on the Universal Transverse Mercator (UTM) system. Differences between spatial representation methods will not be resolved until such differences are somehow mapped into abstract data models useful for many possible implementations.
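
The coordinate-system point can be made concrete with a bounding-box test. The sketch below, with invented coordinates, only gives a meaningful answer because both extents have already been expressed in the same reference system (here, geographic latitude and longitude); comparing raw State Plane feet against UTM metres in this way would produce nonsense, which is exactly why some common abstract model is needed.

    # Bounding boxes as (min_lon, min_lat, max_lon, max_lat); values invented.
    # The test is only meaningful once both extents share one reference system.
    def boxes_overlap(a, b):
        return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

    query_extent   = (-70.0, 44.0, -68.5, 45.0)   # user's area of interest
    dataset_extent = (-69.2, 44.4, -68.9, 44.8)   # from a catalog record
    print(boxes_overlap(query_extent, dataset_extent))   # True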


WAIS applications show exciting possibilities for searching digital spatial data. Catalog information residing within the datasets could be searched, browsed, or scanned. Catalog descriptions and data dictionaries could be compared for relevance feedback. However, the current WAIS solution for accessing multiple sources is very weak. Users reading the “directory of servers” descriptions might be unsure from those descriptions whether a certain server is useful without actually accessing that server. WAIS does allow users to limit their searches to certain servers, which may or may not be desirable.

3.2.2 World Wide Web. The World Wide Web (WWW or W3) is a network retrieval mechanism that employs the techniques of hypertext. WWW consists of documents and links. WWW uses the HyperText Transfer Protocol (HTTP), which allows index searches. Indexes are documents which can be text searched and linked to other documents or to places in other documents. The links are made by marking up text in a Standard Generalized Markup Language (SGML) format, called the HyperText Markup Language (HTML), to tag documents and objects within documents. Based on the concepts that prompted the development of WAIS, WWW is also stateless but does not support relevance feedback (Berners-Lee et al., 1992; Obraczka et al., 1993).
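
The document-and-links model can be illustrated with a bare HTTP request followed by extraction of the hypertext links from the returned HTML. The sketch below uses placeholder host and path names and a deliberately crude regular expression; it is an illustration of the stateless fetch-and-follow-links pattern, not a recommendation for how to parse HTML.

    # Minimal, illustrative HTTP fetch plus link extraction; host and path
    # are placeholders, and a real client would use an HTTP library.
    import re
    import socket

    def fetch(host, path="/", port=80):
        with socket.create_connection((host, port), timeout=10) as s:
            s.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii"))
            data = b""
            while chunk := s.recv(4096):
                data += chunk
        return data.decode("latin-1", errors="replace")

    def links(html):
        """The hypertext links that make a document part of the Web."""
        return re.findall(r'href\s*=\s*"([^"]+)"', html, re.IGNORECASE)

    # page = fetch("www.example.org", "/datasets/index.html")
    # print(links(page))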

Like WAIS, WWW shows many promising trends for searching, browsing, and scanning digital spatial data. Hypertext links may allow spatial datasets to be connected to catalog information and to reports detailing information generated from the datasets. The absence of relevance feedback for searches seems to make it less desirable for users wishing to narrow or expand their searches. Moreover, the placement of hyperlinks themselves appears to be subjective, depending on the use of the dataset. The creation and maintenance of such hyperlinks, and the updating of links to new information, appear to be formidable tasks, especially given the long life of spatial data.

3.2.3 Directory service protocols. The X.500 Directory Service is another OSI application protocol under development. This protocol is a “standard for a global, logically centralized but physically distributed, electronic network directory” (Planka, 1990, p. 94). The directory itself is a database which allows users normal database functions.

A directory is a collection of attributes (i.e., information) about, and relations between, a named set of addressable objects within a specific context. A directory can be viewed as a database containing instances of record types. The most typical relationship between a directory user and the directory itself is that of an information user and an information provider. The user supplies an unambiguous or ambiguous key to the directory, and the directory returns information labeled by the key. The directory user may filter the available information to access only the most essential fields. (NIST, 1988, p. 52)

Most X.500 applications appearing on the Internet to date seem to be limited to supplying “white pages” telephone numbers and e-mail addresses of people in various organizations. Theoretically, X.500 Directory Service applications could be expanded into a cataloging system, since there appears to be no limit to the information one may store in the directory. However, unlike the Z39.50 protocol, searchers are unable to control their searches by using relevance feedback (Lynch, 1990). X.500 applications might also theoretically run in conjunction with Z39.50 applications, supplying information about individual database services to applications based on Z39.50 protocols (Planka, 1990). Directory Services implementations could serve to help users search, browse, scan, and evaluate digital spatial information.
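
The "key in, labeled attributes out, with field filtering" behaviour quoted above can be sketched with an ordinary dictionary, as below. This is not an X.500 implementation, which would involve a hierarchical information tree and the Directory Access Protocol; the entry names, attributes, and contact address are invented placeholders used only to show how such a directory might hold catalog-like information about a dataset.

    # Invented directory entries; a real X.500 directory is hierarchical and
    # accessed through the Directory Access Protocol, not a Python dict.
    directory = {
        "cn=Soil Survey 1992, o=State GIS Office": {
            "objectClass": "dataset",
            "format": "ARC/INFO export",
            "contact": "gis-office@example.gov",      # placeholder address
            "extent": "statewide",
        },
    }

    def lookup(key, wanted_fields=None):
        """Return the attributes stored under a key, optionally filtered
        to only the fields the user considers essential."""
        entry = directory.get(key, {})
        if wanted_fields is None:
            return entry
        return {k: v for k, v in entry.items() if k in wanted_fields}

    print(lookup("cn=Soil Survey 1992, o=State GIS Office", {"format", "contact"}))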

3.3 Trends in networking technology

Knowbots, software mechanisms used to sort information about the Internet and present moderated views to users in a client-server environment, are another approach to finding information over electronic networks. Knowbots can be implemented in two ways. In one approach, knowbots can be located on server systems. Clients would send a list of preferences to the knowbot, which would then send back digested abstracts of the server database. This would reduce network traffic, but it presumes a wide understanding of the vast array of clients that would access such systems. Alternatively, knowbots can reside on clients' systems, pulling in vast amounts of knowledge from multiple servers for digestion and presentation. This would require large computing and storage capacities on the client machine and would create significant network traffic (Klingenstein, 1992). Knowbots help users manage the network environment and would appear to offer at least some limited means of information content evaluation for digital spatial datasets.

The library community has been working with extensions to AACR and MARC to permit the description of networked resources (Dillon et al., 1993; Lynch, 1993b; McCallum, 1991). The Internet Engineering Task Force (IETF) Working Group on Document Identifiers has also been working on standards for identifying networked documents and resources. This approach is designed to bridge the needs of many diverse current and future resource search and retrieval mechanisms such as archie, Gopher, WAIS, and WWW. It attempts to identify both objects and specific locations within objects that might accommodate the hypertext links needed by services such as WWW (Lynch, 1993a; Berners-Lee, 1992). The IETF seems to have embraced the idea of a Uniform Resource Identifier (URI), a structured string that contains both the resource name and the resource address of online resources.
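
The "structured string" character of such identifiers is easy to see by splitting one apart. The example below uses Python's standard urllib.parse module and an invented FTP location; it is meant only to show how a single string can carry both the access method and the address of a resource.

    # The example identifier is invented; urlsplit is from the standard library.
    from urllib.parse import urlsplit

    uri = "ftp://ftp.example.edu/pub/gis/dem/boston.dem"
    parts = urlsplit(uri)
    print(parts.scheme)    # 'ftp': how to get the resource
    print(parts.netloc)    # 'ftp.example.edu': where it lives
    print(parts.path)      # '/pub/gis/dem/boston.dem': which object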

The IETF is also developing templates which are designed to capture information about online resources. These templates are designed to be voluntary and will capture specific resource information on a case-by-case basis. The templates will resemble catalog card entries and will contain resource URIs so that automatic retrieval of resources can be performed. The templates are expected to be adaptable to all levels of resources, including those listed earlier (Deutsch & Emtage, 1993a, 1993b; Deutsch et al., 1993). Such information is expected to permit advanced automated data capture similar to archie. One X.500 application, Whois++, is extending the template application to build a hierarchical information directory that will allow users to either actively or transparently navigate layers of resource information to locate the particular resource needed (Weider et al., 1993).
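
Because such templates resemble catalog cards made of labeled fields, they lend themselves to automated harvesting. The sketch below parses one invented, colon-delimited record into a dictionary of fields; the field names shown are illustrative, not the exact vocabulary of the IETF template drafts.

    # An invented, catalog-card-like template record; the field names are
    # illustrative, not the exact IETF template vocabulary.
    record = """\
    Template-Type: DATASET
    Title: County soil survey, 1992
    URI: ftp://ftp.example.edu/pub/gis/soils92.e00
    Format: ARC/INFO export
    Abstract: Digitized county soil polygons with attribute tables.
    """

    def parse_template(text):
        fields = {}
        for line in text.splitlines():
            if ":" in line:
                name, _, value = line.partition(":")
                fields[name.strip()] = value.strip()
        return fields

    print(parse_template(record)["URI"])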

3.4 Summary

Interesting and exciting new methods of finding information across a network of sources are being developed. However, these methods lack the robust features of traditional cataloging that allow users to determine what “fitness for use” online resources might possess. These problems are being addressed by the Internet community. However, users with special needs, such as those wishing to retrieve geocoded information, will need to work beyond the general issues facing the National Information Infrastructure. The resource discovery issues for spatial information are being raised by the FGDC in a series of meetings addressing the National Spatial Data Infrastructure (NSDI), an integral part of the NII.

4. CONCLUSIONS

An information infrastructure interconnecting thousands of sources of electronic data has spawned a new era in the Information Age. Digital spatial data, in the form of complex models of the real world, are a part of this new infrastructure. Users of digital spatial data will need guides to help them navigate this infrastructure to find data appropriate for their needs. The voluminous nature of spatial datasets suggests that data surrogates, such as data catalogs, may be useful for finding appropriate data.

Cataloging has progressed from a manual, to a computer-based, to a network-based environment. As users develop specialized cataloging systems to meet their individual needs for digital spatial data, there is a great need to coordinate these efforts so that information can be shared easily in a network environment. Digital spatial data users have special needs that must be addressed, including the ability to assess the fitness for use of data for their various applications and the ability to perform spatially based searches for data. To discover how to coordinate digital spatial data cataloging efforts, we need to look not only at developments in cataloging systems and the specialized needs of users but also at developments in networking technology. However, digital spatial data cataloging, digital spatial data needs assessments, and electronic networking are still in the early stages of development. Regardless, we need to begin investigating the range of practical alternative scenarios for development and implementation of digital spatial data cataloging in order to achieve easily accessible and compatible sources of electronic information that will fill the needs of the myriad information users of the National Information Infrastructure.

REFERENCES

Akervall, L., Degerstedt, K., & Rystedt, B. (1991). Spatial metadata systems at the National Land Survey of Sweden. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 153-170). Loughborough, UK: Group D.

Alberti, B., Anklesaria, F., Lindner, P., McCahill, M., & Torrey, D. (1992). The Internet Gopher protocol: A distributed document search and retrieval protocol (draft report) [machine-readable data file]. Minneapolis: University of Minnesota, Microcomputer and Workstation Networks Center (Producer & Distributor).

Al-Taha, K., & Frank, A. (1992). Temporal GIS keeps data current. In H.D. Parker (Ed.), 1991-1992 International GIS Sourcebook (pp. 384-388). Fort Collins, CO: GIS World.

Aronoff, S. (1989). Geographic information systems: A management perspective. Ottawa, Ontario, Canada: WDL Publications.

Berners-Lee, T. (1992). Universal Resource Locators (draft report) [machine-readable data file]. Internet Engineering Task Force (Producer & Distributor).

Berners-Lee, T., Cailliau, R., Groff, J., & Pollermann, B. (1992). World-Wide Web: The information universe [machine-readable data file]. Geneva, Switzerland: CERN (Producer & Distributor).

Blue, M., & Lee, Y.C. (1992). A browsing system for CARIS on a network. In The Canadian Conference on GIS Proceedings (vol. 1, pp. 33-44). Ottawa, Ontario, Canada: The Canadian Institute of Surveying and Mapping & The Inter-Agency Committee on Geomatics.

Borko, H., & Bernier, C.L. (1978). Indexing concepts and methods. New York: Academic Press.

Bourne, C.P. (1989). A review of technology and trends in document delivery services. In Proceedings of the Library of Congress Network Advisory Committee Meeting (pp. 9-15). Washington, DC: Library of Congress.

Buckland, M.K. (1988). Library services in theory and context (2nd ed.). Oxford: Pergamon Press.

Burnhill, P. (1991). Metadata and cataloguing standards: One eye on the spatial. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 13-38). Loughborough, UK: Group D.

Carpenter, M. (1992). LCGISN as a union catalog of cartographic material for coastal Louisiana. Louisiana Coastal GIS Network Newsletter, 2(1), 4-5.

Carver, L. (1989). Georeference information network. In Proceedings of the 1989 ASPRS/ACSM Annual Conference (vol. 1, pp. 175-180). Baltimore, MD: American Congress on Surveying and Mapping & American Society for Photogrammetry and Remote Sensing.

Cawsey, A., Galliers, J., Reece, S., & Jones, K.S. (1992). Automating the librarian: Belief revision as a base for system action and communication with the user. The Computer Journal, 35(3), 221-232.

Cornelius, S. (1991). Spatial data auditing. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 39-54). Loughborough, UK: Group D.

Crawford, W. (1989). MARC for library use (2nd ed.). Boston: G.K. Hall & Co.

Cromp, R.F. (1989). Automated extraction of metadata from remotely sensed satellite imagery. In Proceedings of ACSM-ASPRS 1991 Annual Convention (vol. 3, pp. 111-120). Baltimore, MD: American Congress on Surveying and Mapping & American Society for Photogrammetry and Remote Sensing.

Dangermond, J. (1992). Client/server approach to GIS interfaces. In Proceedings of the AM/FM International Conference XV (pp. 480-486). San Antonio, TX: AM/FM International.

Davison, W. (1990). OSI upper layers support for applications. Library Hi Tech, 8(4), 33-42.

Denenberg, R. (1990). Data communications and OSI. Library Hi Tech, 32(4), 15-32.

Deutsch, P. (1992). Resource discovery in an Internet environment: The archie approach. Electronic Networking, 2(1), 45-51.

Deutsch, P., & Emtage, A. (1993a). Data element templates for Internet information objects (draft report) [machine-readable data file]. Internet Engineering Task Force (Producer & Distributor).

Deutsch, P., & Emtage, A. (1993b). Publishing information on the Internet with anonymous FTP (draft report) [machine-readable data file]. Internet Engineering Task Force (Producer & Distributor).

Deutsch, P., Emtage, A., & Marine, A. (1993). How to use anonymous FTP (draft report) [machine-readable data file]. Internet Engineering Task Force (Producer & Distributor).

Dillon, M., Jul, E., Burge, M., & Hickey, C. (1993). Assessing information on the Internet: Toward providing library services for computer-mediated communications (research report). Dublin, OH: Online Computer Library Center, Inc.

Ercegovac, Z., & Borko, H. (1992a). Design and implementation of an experimental cataloging advisor-mapper. Information Processing & Management, 28(2), 241-257.

Ercegovac, Z., & Borko, H. (1992b). Performance evaluation of mapper. Information Processing & Management, 28(2), 259-268.

Evans, J., Ferreira, J., & Thompson, P. (1992). A visual interface to heterogeneous spatial databases based on spatial metadata. In Proceedings of the 5th International Symposium on Spatial Data Handling (pp. 282-293). Charleston, SC: IGU Commission on GIS.

Fenly, C. (1992). Technical services processes as models for assessing expert system suitability and benefits. In F.W. Lancaster & L.C. Smith (Eds.), Artificial intelligence and expert systems: Will they change the library? (pp. 50-66). Urbana-Champaign: University of Illinois.

Federal Geographic Data Committee. (1992). Content standards for spatial metadata: Draft. Federal Register, 57(224), 54,605-54,606.


Federal Interagency Coordinating Committee on Digital Cartography. (1989). Coordination of digital cartographic activities in the federal government (final report). Washington, DC: U.S. Government Printing Office.

Flowerdew, R. (1991). Spatial data integration. In M.F. Goodchild, D.W. Rhind, & D.J. McGuire (Eds.), Geographical information systems: Principles and applications (vol. 1, pp. 375-387). Essex, UK: Longman Scientific & Technical.

Frank, A.U. (1992). Acquiring a digital base map: A theoretical investigation into a form of sharing data. Journal of Urban and Regional Information Systems, 1, 10-23.

Frants, V.I., Shapiro, J., & Votskunskii, V.G. (1993). Multiversion information retrieval systems and feedback with mechanisms of selection. Journal of the American Society for Information Science, 44(1), 19-27.

Global Data Catalog. (1992). InfoTEXT, Spring, 7.

Gould, C.C., & Pierce, K. (1991). Information needs in the sciences: An assessment. Mountain View, CA: The Research Libraries Group, Inc.

Guenther, O., & Buchman, A. (1990). Research issues in spatial databases. SIGMOD Record, 19, 61-68.

Hagler, R. (1991). The bibliographic record and information technology (2nd ed.). Chicago: American Library Association.

Handley, T.H., Li, Y.P., Jacobson, A.S., & Tran, A.V. (1992). DataHub: Knowledge-based science data management. In Proceedings of ASPRS/ACSM/RT 92 (vol. 1, pp. 122-134). Washington, DC: American Congress on Surveying and Mapping & American Society for Photogrammetry and Remote Sensing.

Hiland, M.W., Wayne, L., & Streiffer, H.R. (1992). Louisiana coastal GIS network: Relational database design for a spatially indexed cataloging system. In Proceedings of GIS/LIS '92 (vol. 1, pp. 322-338). San Jose, CA: American Congress on Surveying and Mapping & American Society for Photogrammetry and Remote Sensing.

Hirschheim, R., & Newman, M. (1991). Symbolism and information systems development: Myth, metaphor and magic. Information Systems Research, 2(1), 29-62.

Hirshon, A. (1993). The convergence of publishing and bibliographic access. In A. Hirshon (Ed.), After the electronic revolution, will you be the first to go? (pp. 1-8). Chicago: American Library Association.

Holmes, D.O. (1990). Computers and geographic information access. Meridian, 4, 37-49.

Hufford, J.R. (1991). The pragmatic basis of catalog codes: Has the user been ignored? Cataloging and Classification Quarterly, 14, 27-37.

Hunter, E.J. (1985). Computerized cataloging. London: Clive Bingley.

Jagadish, H.V. (1991). A retrieval technique for similar shapes. SIGMOD Record, 20(2), 208-217.

Jennings, A., & Higuchi, H. (1992). A browser with a neural network user model. Library Hi Tech, 10(1&2), 77-93.

Johnson, D., Malmberg, H., Taylor, M., Hubble, C., & Lovett, B. (1990). The meta-database: A generic approach to information storage and retrieval. Canberra, Australia: National Resource Information Center.

Johnson, D., Shelley, P., Taylor, M., & Callahan, S. (1991). The FINDAR Directory System: A meta-model for metadata. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 123-138). Loughborough, UK: Group D.

Kahin, B. (1992a). Overview: Understanding the NREN. In B. Kahin (Ed.), Building information infrastructure (pp. 5-14). New York: McGraw-Hill.

Kahin, B. (1992b). The NREN as information market: Dynamics of public, private, and academic publishing. In B. Kahin (Ed.), Building information infrastructure (pp. 323-343). New York: McGraw-Hill.

Kahle, B., & Medlar, A. (1991). An information system for corporate users: Wide area information servers. Online, September, 56-60.

Kahn, S.W. (1991). Support services for remote users of online public access catalogs. RQ, Winter, 197-212.

Kleinrock, L. (1992). Technology issues in the design of the NREN. In B. Kahin (Ed.), Building information infrastructure (pp. 174-198). New York: McGraw-Hill.

Klingenstein, K. (1992). A coming of age: The design of the low-end Internet. In B. Kahin (Ed.), Building information infrastructure (pp. 119-142). New York: McGraw-Hill.

Krol, E. (1992). The whole Internet user's guide and catalog. Sebastopol, CA: O'Reilly and Associates.

Krol, E., & Hoffman, E. (1993). What is the Internet? (RFC 1462) [machine-readable data file]. Internet Engineering Task Force, Network Working Group (Producer & Distributor).

Lai, P., & Gillies, C.F. (1991). The impact of geographic information systems on the role of spatial data libraries. International Journal of Geographic Information Systems, 5, 241-251.

Larsgaard, M.L. (1978). Map librarianship: An introduction. Littleton, CO: Libraries Unlimited.

Larsgaard, M.L. (1992). Cataloging control concepts and initiatives. Paper presented at the Fourth Library Information and Technology Association (LITA) Meeting, Denver, CO.

Lee, S., & Hsu, F. (1992). Spatial reasoning and similarity retrieval of images using 2D C-string knowledge representation. Pattern Recognition, 25(3), 305-318.

Lee, Y.C., & McLaughlin, J.D. (1991). Distributed land information networks: Database management issues. CISM Journal, 45(3), 353-363.

Library of Congress. (1988). USMARC code list for relators, sources, description conventions. Washington, DC: Library of Congress.

Lillywhite, J. (1991). Identifying available spatial metadata: The problem. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 3-12). Loughborough, UK: Group D.

Lucas, S., & Rose, R. (1991). The provision of general access to spatially integrated data. In Proceedings of the GIS '91 Symposium (pp. 203-206). Vancouver, British Columbia, Canada: Forestry Canada.

Lynch, C.A. (1990). Information retrieval as a network application. Library Hi Tech, 32(4), 57-71.

Lynch, C.A. (1993a). A framework for identifying, locating, and describing networked information resources (draft report). Berkeley: University of California, Library Automation.

Lynch, C. (1993b). Interoperability: The standards challenge for the 1990s. Wilson Library Bulletin, March, 38-42.

Lynch, C.A., & Preston, C.M. (1992). Describing and classifying networked information resources. Electronic Networking, 2(1), 13-23.


Machovec, G.S. (1992). WAIS: Wide area information servers. Cambridge, MA: Thinking Machines, Inc.

Marble, D.F. (1987). Design and implementation of a master spatial data indexing system: An international case. In Proceedings of International Workshop on Geographic Information System Beijing '87 (pp. 1-14). Beijing, China: International Geographical Union.

Markoff, J. (1991). For the PC user, vast libraries. New York Times, July 3, C1.

Markoff, J. (1993). Building the electronic superhighway. New York Times, January 24, 3-1, 3-6.

Markuson, B.E. (Ed.). (1991). Networks for Networkers II Conference. Washington, DC: Library of Congress.

McAbee, J.L., III (1992). The importance of "openness" to data for GIS applications. In Proceedings of EGIS '92 (vol. 1, pp. 163-172). Munich, Germany: EGIS Foundation.

McBride, R.A., Davis, D.W., Jones, F.W., Byrnes, M.R., Braud, D., Hiland, M.W., Lewis, A.J., & Streiffer, H.R. (1991). Louisiana coastal geographic information system network (LCGISN): Access to spatial data. Meridian, 6, 29-43.

McCallum, S.H. (1991). Dictionary of data elements for online information resources (MARBI discussion paper no. 54) [machine-readable data file]. Washington, DC: Library of Congress (Producer); Internet: Public-Access Computer Systems Forum (PACS-L) (Distributor).

McDonald, K.R., & Blake, D.J. (1991). Information management challenges of the EOS data and information system. In Proceedings of 1991 ACSM-ASPRS Annual Conference (vol. 3, pp. 258-267). Baltimore, MD: American Congress on Surveying and Mapping & American Society for Photogrammetry and Remote Sensing.

Medyckyj-Scott, D. (1991). User-oriented inquiry facilities. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 85-112). Loughborough, UK: Group D.

Metzler, D.P. (1992). Artificial intelligence: What will they think of next? In F.W. Lancaster & L.C. Smith (Eds.), Artificial intelligence and expert systems: Will they change the library? (pp. 2-49). Urbana-Champaign: University of Illinois.

National Information Standards Organization. (1991). ANSI Z39.50 Version 2 (third draft) [machine-readable data file]. Washington, DC: National Information Standards Organization (Producer); Cambridge, MA: Thinking Machines Corporation (Distributor).

National Institute of Standards and Technology. (1988). Government open systems interconnection profile (GOSIP) (FIPS PUB 146-1). Washington, DC: U.S. Government Printing Office.

Newman, I. (1991). Data dictionaries, information resource dictionary systems and metadatabases. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 69-84). Loughborough, UK: Group D.

Norman, D.A. (1988). The design of everyday things. New York: Doubleday/Currency.

Nyman, L., & Sealy, V. (1993). GIS FAQ: Frequently asked questions [machine-readable data file]. Internet: Geographic Information Systems Forum (GIS-L) (Distributor).

Obraczka, K., Danzig, P.B., & Li, S. (1993). Internet resource discovery services. Computer, 26(9), 8-22.

O'Gorman, L., & Kasturi, R. (1992). Document image analysis. Computer, 25(7), 5-8.

Onsrud, H.J., & Rushton, G. (1992). Institutions sharing geographic information (tech. rep. no. 92-5). Santa Barbara: University of California, National Center for Geographic Information and Analysis.

Ottens, H.F.L., & Zandee, R.H. (1991). The NexpRI information bank: A meta-information system on GIS activities in the Netherlands. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 139-151). Loughborough, UK: Group D.

Pascoe, R.T., & Penny, J.P. (1990). Construction of interfaces for the exchange of geographic data. International Journal of Geographic Information Systems, 4, 147-156.

Perez, E. (1993). Hints and warnings about using "WAIS" software for searching [machine-readable data file]. Internet: Library Gopher List (G04LIB-L) (Distributor).

Planka, D. (1990). Network directory services. Library Hi Tech, 32(4), 93-104.

Riddle, P. (1993). Cataloging and integrating Internet resources [machine-readable data file]. Internet: Library Gopher List (G04LIB-L) (Distributor).

Rivera, D., & Wood, A.A. (1992). Results of Congress of Cartographic Information Specialists Associations Survey [machine-readable data file]. Internet: Maps and Air Photo Systems Forum (MAPS-L) (Distributor).

RLG enters new sphere with Geoinformation Project. (1989). The Research Libraries Group News, pp. 3-9.

Rorvig, M.E. (1993). A method for automatically abstracting visual documents. Journal of the American Society for Information Science, 44(1), 40-56.

Rubin, T. (1992). Using common database and spreadsheet/plotting programs as inexpensive geographic information systems. In A.I. Johnson, C.B. Petersson & J.L. Fulton (Eds.), Geographic information systems (GIS) and mapping: Practices and standards (pp. 97-105). Philadelphia: ASTM.

Ruggles, C., & Newman, I. (1991). The MRRL's meta-information retrieval and access system. In I. Newman, D. Medyckyj-Scott, C. Ruggles, & D. Walker (Eds.), Metadata in the geosciences (pp. 187-210). Loughborough, UK: Group D.

Shelley, E.P., & Johnson, B.D. (1991). Towards a national directory of natural-resources data. In Proceedings of the National Conference on the Management of Geoscience Information and Data (pp. 19-29). Adelaide, Australia: Australian Mineral Foundation.

Sinha, A. (1992). Client-server computing. Communications of the ACM, 35(7), 77-97.

Stanfill, C. (1991). Massively parallel information retrieval for wide area information servers. Cambridge, MA: Thinking Machines Corporation.

Stein, R.M. (1991). Browsing through terabytes. Byte, May, 157-164.

Stephenson, G.A. (1988). Knowledge browsing: Front ends to statistical databases. In The Fourth International Working Conference on Statistical and Scientific Database Management, 339 (pp. 327-337). Rome, Italy: Springer-Verlag.

Szymanski, W.J., & Morris-Jones, D.R. (1991). The role of open systems in a GIS: Costs, risks, and benefits. In Proceedings of the 1991 Geographic Information Systems (GIS) for Transportation Symposium (vol. 1, pp. 11-22). Orlando, FL: American Association of State Highway and Transportation Officials.


Thapa, K., & Bossler, J. (1992). Accuracy of spatial data used in geographic information systems. Photogrammetric Engineering & Remote Sensing, 6, 835-841.

Tibbo, H.R. (1992). Abstracting across the disciplines: A content analysis of abstracts from the natural sciences and the humanities with implications for abstracting standards and online information retrieval. Library & Information Science Research, 14, 31-56.

United States Geological Survey. (1990). Spatial data transfer standard. Washington, DC: U.S. Department of the Interior, U.S. Geological Survey, National Mapping Division.

van den Doel, E. (1992). Evaluation of the information on Geo-Databases in the NexpRI information bank. In Proceedings of EGIS '92 (vol. 1, pp. 599-605). Munich, Germany: EGIS Foundation.

Vrana, R. (1992). Design and operation of a Meta-GIS. In Proceedings of the GIS '92 Symposium (pp. 1-11). Vancouver, British Columbia, Canada: Forestry Canada.

Walker, D., Newman, I., Ruggles, C., & Medyckyj-Scott, D. (1992). Tiling over space: A formal method for identifying spatial constraints in information retrieval. In Proceedings of EGIS '92 (vol. 2, pp. 1614-1623). Munich, Germany: EGIS Foundation.

Walsh, B. (1991). Western Australia's state land information directory. URISA Journal, 4(1), 91-92.

Weibel, S. (1992). Automated cataloging: Implications for libraries and patrons. In F.W. Lancaster & L.C. Smith (Eds.), Artificial intelligence and expert systems: Will they change the library? (pp. 67-80). Urbana-Champaign: University of Illinois.

Weider, C., Fullton, J., & Spero, S. (1993). Architecture of the Whois++ index service (draft report) [machine-readable data file]. Internet Engineering Task Force (Producer & Distributor).

Wiggins, R. (1993). The University of Minnesota's Internet Gopher system: A tool for accessing network-based electronic information. Public-Access Computer Systems Review, 4(2), 4-60.

Wright, R.A., & Lee, M. (1992). Addressing data standards: The northwest land information system network. In A.I. Johnson, C.B. Petersson & J.L. Fulton (Eds.), Geographic information systems (GIS) and mapping: Practices and standards (pp. 71-75). Philadelphia: ASTM.

Yannakoudakis, E.J., Ayres, F.H., & Huggill, J.A.W. (1990). Matching of citations between non-standard databases. Journal of the American Society for Information Science, 41(8), 599-610.