+>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing...

15
Delivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue, 14 May 2013 23:14:44 Copyright (c) Cartography and Geographic Information Society. All rights reserved. Cartographic Data Structures Thomas K. Peucker and Nicholas Chrisman ABSTRACT. Efficient and flexible data structures are important to the development of computer mapping. Most current data banks are characterized by 1) structures which' are convenient at the input stage rather than at the stages of use within computer programs, 2) separate and uncoordinated files for different types of geographic fea- hires, and 3) a lack of information about neighboring entities. The term "neighbor- hood function" may be used to indicate the relative location of a geographic entity and is a concept which is involved in all three of these characteristics. Ongoing re- search on data structures had led to work on the GEOGRAF system for encoding planar data and the GDS ("Geographic Data Structure") for encoding three-dimen- sional surfaces. Both involve data manipulation between the digitizing stage and the actual use of the data within computer mapping programs. I, 1975, pp. 55-69 SS INTRODUCTION A series of ongoing research projects are concerned with efficient and flexible data structures for geographic and carto- graphic analysis. The three main points of concern in the research can be sum- marized as follows: 1) In most cartographic data banks, the arrangement of the data is guided by the input stage. In other words, little ma- nipulation of the data is performed after the data have been input into the system from maps. 2) Cartographers and computer scien- tists have made few attempts to combine different types of cartographic information, for example, height with other cartographic features. Therefore, the different types of cartographic entities are stored in different files and it is usually extremely time-con- suming to combine them. 3) The data structure is usually very simple and lacks one facet in particular which is essential for much geographic and Thomas K. Peucker is associate professor, Simon Fraser University, Burnaby, Canada. Nicholas Chrisman is senior programmer, Lab- oratory for Computer Graphics and Spatial Anal- ysis, Harvard University, Cambridge, Massachu- setts. The research for this paper has been supported by ONR contract NOOOI4-73-C-OI09 and the Laboratory for Computer Graphics and Spatial Analysis, Harvard University. The America1~ Cartographer, Vol. 2, No. cartographic analysis-an indication of the relative location of a geographic entity, i.e., the position of a geographic entity with respect to its neighboring entities. These three points may be abbreviated with the terms flexibility, comparability, and topology. This paper will characterize types of existing geographic and carto- graphic data systems for planar and three- dimensional surfaces, especially with re- spect to these three points. The paper will also describe attempts which have been made by the authors to produce data sys- tems which eliminate some of the problems of existing ones. The term "neighborhood function" will playa major role throughout the paper and will therefore be explained in more detail in the following section. NEIGHBORHOOD FUNCTION When asked for the location of a city, we will give the location with respect to a river, a seacoast, a pass, a neighboring larger city, or other feature. Rarely will we use the geographic coordinates of longi- tude or latitude, nor will we use map co- ordinates. We are taught in elementary geography that the geographic coordinates wiII tell us little about either the large-scale (site) or small-scale (situation) char- acteristics of a place. Similarly, if I de-

Transcript of +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing...

Page 1: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

Cartographic Data StructuresThomas K Peucker and Nicholas Chrisman

ABSTRACT Efficient and flexible data structures are important to the development ofcomputer mapping Most current data banks are characterized by 1) structures whichare convenient at the input stage rather than at the stages of use within computerprograms 2) separate and uncoordinated files for different types of geographic fea-hires and 3) a lack of information about neighboring entities The term neighbor-hood function may be used to indicate the relative location of a geographic entityand is a concept which is involved in all three of these characteristics Ongoing re-search on data structures had led to work on the GEOGRAF system for encodingplanar data and the GDS (Geographic Data Structure) for encoding three-dimen-sional surfaces Both involve data manipulation between the digitizing stage and theactual use of the data within computer mapping programs

I 1975 pp 55-69

SS

INTRODUCTIONA series of ongoing research projects

are concerned with efficient and flexibledata structures for geographic and carto-graphic analysis The three main pointsof concern in the research can be sum-marized as follows

1) In most cartographic data banks thearrangement of the data is guided by theinput stage In other words little ma-nipulation of the data is performed afterthe data have been input into the systemfrom maps

2) Cartographers and computer scien-tists have made few attempts to combinedifferent types of cartographic informationfor example height with other cartographicfeatures Therefore the different types ofcartographic entities are stored in differentfiles and it is usually extremely time-con-suming to combine them

3) The data structure is usually verysimple and lacks one facet in particularwhich is essential for much geographic and

Thomas K Peucker is associate professorSimon Fraser University Burnaby CanadaNicholas Chrisman is senior programmer Lab-oratory for Computer Graphics and Spatial Anal-ysis Harvard University Cambridge Massachu-setts The research for this paper has beensupported by ONR contract NOOOI4-73-C-OI09and the Laboratory for Computer Graphics andSpatial Analysis Harvard UniversityThe America1~ Cartographer Vol 2 No

cartographic analysis-an indication of therelative location of a geographic entityie the position of a geographic entity withrespect to its neighboring entities

These three points may be abbreviatedwith the terms flexibility comparabilityand topology This paper will characterizetypes of existing geographic and carto-graphic data systems for planar and three-dimensional surfaces especially with re-spect to these three points The paper willalso describe attempts which have beenmade by the authors to produce data sys-tems which eliminate some of the problemsof existing ones The term neighborhoodfunction will playa major role throughoutthe paper and will therefore be explained inmore detail in the following section

NEIGHBORHOOD FUNCTION

When asked for the location of a city wewill give the location with respect to ariver a seacoast a pass a neighboringlarger city or other feature Rarely willwe use the geographic coordinates of longi-tude or latitude nor will we use map co-ordinates We are taught in elementarygeography that the geographic coordinateswiII tell us little about either the large-scale(site) or small-scale (situation) char-acteristics of a place Similarly if I de-

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

scribe my position on a piece of terrain Iwill not use my map to determine my lo-cation within the UTM-grid rather I willlook for nearby relief features (peaksrivers slopes roads) as orientation char-acteristics

In contrast when a geographic data bankof any kind is created it utilizes some kindof absolute coordinate system Usuallyneither the geographic evaluation at thetime of digitizing- nor the mapping systemused with the data allow the inclusion ofsuch cartographic features as streets riversand roads which would give us an indica-tion of relative location

While the human user can be aided inhis orientation on a map through overlaysor map comparison the computer has dif-ficulty in determining relative location Ifthe relative location in terms of the closestpoints for each of say 5000 points is tobe calculated the program literally has tocompute every points distance to everyother point Indeed some widely-usedprograms do this computation several timeswithin one program run

Some indication of the relative locationof a geographic feature can be very usefulThis neighborhood relationship will be re-ferred to as a neighborhood functionIt can be expressed in different ways asan explicit or implicit function or as a dis-crete function in the form of a table

The exPlicit function can be a polynomialor trigonometric equation set for a discretegrid of surface patches vhich give the formof the surface at each point within thepatch Typical for this approach is thework of Junkins et al (1973) Two-dimensional spline functions also fall intothis category (Holroyd and Bhattacharyya]970) A much more frequent way of de-fining a neighborhood is by the explicitfunction in the form of a sort routine whichfinds the closest neighbors This is donein various interpolation algorithms to pro-duce a regular grid of points (Shepard1968 Heiskanen and Moritz 1967) Thecomputations increase close to the squareof the increase in the number of pointssince the search has to be repeated forevery point and all points or at least a

56

large number of them must be processedeach time

This search procedure also applies inthe case of planar surfaces where neigh-boring polygons must be found For ex-ample when contiguity constraints are im-posed in problems of factorial ecology andother regional correlations all polygonpoints must be searched to find those whichare in common for a pair of polygonsAgain the problem increases in complexityaccording to the square of the number ofthe items being searched

The imPlicit function expressing neigh-borhood relationships is usually a functionthat describes the coding structure of thegeographic entities One very good caseis described in Rosenfeld (1969) for dif-ferent types of neighborhood relationshipswithin a regular grid in which the pointPij has the four neighbors (i + 1 j) (i j +1) (i-l j) (i j-l) and it has the eightneighbors (i+l j) (i+l j+l) (i j+l)(i-I j+l) (i-I j) (i-I j-l) (i j-1) (i+l j-I)

Neighborhood relationships in the formof tables are very rarely used This typerecords the neighborhood function bypointers indicating neighboring geo-graphic entities For example a structurewhich is built on the basis of Thiessenpolygons could have such a structure bysimply having the labels of the neighboringpoints accompany the record of each pointThe most widely-known structure of thistype is the DIME file of the US Censuswhich encodes line segments the names ofthe polygons to the left and right of eachline segment and the names of the twonodes at either end The neighborhoodrelationships used in the DIME develop-ment are derived from the discipline oftopology (Cooke and Maxfield 1967)

Neighborhood relationships will be dis-cltssed in greater detail in the analysis ofvarious existing data structures Twotypes of geographic data bases will be dis-cussed those defining planar surfaces andthose defining three-dimensional surfacesFor both types a summary of their his-torical development will be presented andit will be shown that although presently

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

at difIerent stages of clevelopl11llltthesetwo types can be treated as special casesof one topological data structure

DATA STRUCTURES FORPLANAR SURFACESTypes of Structures

The types of geographic entities onplanar surfaces are points lines and area-enclosing lines or polygons The latter areperhaps the most frequently encoded fea-ture in geographic data systems

The simplest data base system for planarsurfaces is that of encoding entity by entitywith little or no regard for entity overlapsor adjacencies (Fig 1) In other wordsevery polygon in a polygon system is en-coded and stored without any regard forcontiguous polygons and lines are encodedwithout regard for the fact that they mayintersect or merge with other lines Theresults of such an encoding are sliverlines (duplication of lines in slightly dif-ferent positions) These sliver lines areconfusing and unaesthetic and hence it isvirtually impossible to do anything directlywith such a data base except an extremelycoarse graphic image

To go beyond the use of such data forthe production of coarse images editingmust be performed This alternative hasbeen attempted in several cases one beingthe MAP-MODEL system (Arms 1970)The editing in this system is guided by theassumption that every segment has to berepresented twice except for segments onthe outer boundary For each segmentthe editing program sorts through all re-maining segments to find its complement(the identical line of the neighboring poly-gon) Those segments for which it hasnot found a complement are tagged to in-dicate potential errors to the user

To overcome some of the limitations ofindependently encoded entities systemshave been developed based on a commonlocation dictionary This dictionary con-tains the coordinates of every boundarypoint on the map Polygon boundary listsare then compiled which consist of thelabels (location numbers) of these bound-ary points (Fig 2) Line information can

fol 2 No1 April 1975

LOCATION LIST

POLYGON 1 X1 YI

x2 Yl2

Xln Ytn

POLYGON 2 X21 Y21

119 22 Y22

Fig 1 The simplest encoding is areal en-tity by areal entity In this system mostpoints are recorded twice and some (egP14 P2 2 Pa 7) three times A point doesnot nec~ssarily have identical coordinates inall recordings

be handled in the same way Programsbased on this structure include CAL-FORM from the Laboratory of ComputerGraphics and Spatial Analysis Otherprograms have subroutines to convert thispoint dictionary structure to the simpleentity-by-entity line list described aboveThe data canmiddot then be used in programssuch as SYMAP (Laboratory for Com-puter Graphics and Spatial Analysis)which are compatible with the entity-byentity structure Programs have also beendeveloped to simplify data input throughautomated polygon identification (Douglas1973)

57

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

POLYGON

3 125 242322214344454647483029 28 27261

5 4443 424140 393837 3635 44

Fig 2 In the second type of polygon sys-tem the individual points are encoded onlyonce and are stored in the Point DictionaryFor every polygon a Polygon Boundary Listis then established

The point dictionary data base has theadvantage that sliver lines do not occurHowever the problem of neighborhoodrelationships is not handled any better thanwith the entity-by-entity approach Thesearch for common lines is no longer ac-cording to the coordinates of points but bytheir labels this brings us closer to asolution only by a little less computer timeIt also creates difficulties A point diction-ary can and will be accessed in an arbitraryorder since there are no restrictions re-garding point placement The standard

1

POINT DICTIONARYPOINT

1 x y2 x y3 x y

48 x y

response to this problem is to make thedictionary core resident Unfortunatelythis will limit the complexity of the mapthat can be handled in this manner sinceall points must be stored throughout theoperation of the program This short-coming of such sharing of data is aug-mented by the continued independence ofthe entities created by the dictionary in-stead of n points with their r and y co-ordinates there are simply n referencesto points

Some of the objections to the ordinarypoint dictionary approach can be eliminatedby formulating an intermediate object be-tween the entity and the points used as anaddressing scheme (Nake and Peucker1972 Peucker (ed) 1973) A geographicentity can be created from a list of line seg-mentso these segments are in turn createdfrom references to the point dictionaryThis system allows for easy definition ofthe entities with a minimum of pointersbut each entity is still independent in thesense that its neighbors are not knownThe direction of access is still from entityto location but not the reverse

All of the data structures described thusfar are of limited flexibility and utilitybecause neighborhood relationships are notknown By adding the topological neigh-borhood function of each element to a datastructure large improvements in flexibilityand scope of applications can be realized

If one is concerned about the memorycapacity which is needed for the storage ofexplicit neighborhood relationships onemight consider a system with implicitneighborhood functions by modifying theentity form in the encoding stage Manyexisting geographic information systemsstore land-use data in grids of rectangularcells (Hsu 1975) However a seriousproblem is encountered with such a regulardiscrete encoding of planar surfaces Ac-cording to the sampling theorem thesampling interval has to be half the size ofthe smallest features to be encoded (egTobler 1969) Hence either the size ofthe cells has to be very small to enable en-coding of detailed variations (eg urbanland uses) or a grosser cell size can be

58 The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

CENSUS ADDRESS CODING GUIDE RECORDS

DIME STREET SEGMENT RECORDS

STREET NODE NODE TRACT BLOCK TRACT BLOCK LOW HIGHSTART END LEFT LEFT RIGHT RIGHT ADDR ADD~

Main 5 6 105 104 1 10Main 6 7 106 103 11 17Main 7 8 101 103 19 28Malt 8 9 101 102 30 42

Fig 3 The DIME-file base Representedis a portion of Tract 1 in a hypothetical city(Cooke and Maxfield 1967) The smallnumbers at middotthe street intersections are thenodes the larger numbers on Main Streetare the addresses Two types of records arccreated as illustrated in the tables

]D-_2 10_3__ ~

]I 105 JD[4 GROVE STREET 3

HIGH ADDRESS

4228

1091741

LOW ADDRESS

3012211119

28

419~0[~

8 ELM STREET 11

BLOCK

102103104105106101

~TRACT ~~

19

CHURCH

101

MainMain

MainMainMainMain

STREET TRACT

system for the Address Coding effort ofthe 1970 census

The basic element of the DIME file is aline segment defined by two end pointsIt is assumed that the segment is straightand not crossed by any other line Themetropolitan files usually define this unitas a street block face Complex lines arerepresented by a series of segments ap-proximating the line The segment hastwo node identifiers along with the co-ordinates of its two end points and codesfor the polygon on each side of the seg-ment

While DIME topology makes much 111-

Structures Using ExplicitTopological Relationships

One of the first known attempts to in-corporate explicit topological structure intoa geographic data base is the Dual Inde-pendent Map Encoding (DIME) systemof the US Bureau of the Census (Fig 3)The DIME files were originally developedas an automated topological error detection

used for encoding only the very slowlychanging variations (generalized landuse) In the case of the small units a verylarge volume of redundant information iscreated in an area of uniform land use andmatrix reduction techniques such as run-length encoding (recording in each rowonly those cells differing from an immedi-ate]y previous cell) create only physicaland not logical compaction (Amidon andAkin 1970) In the case of the grossercell size highly varying land use featuresare aggregated to a degree which reducesthe usefulness of the whole system

Beyond sampling problems a grid struc-ture imposes a bias toward specific orienta-tions of features Diagonals althoughphysically longer are given the apparentcell relationships of unit distances alongthe major axes The grid structure alsocreates the illusion of working with a dis-crete point space rather than a regularareal partitioning

Recent studies have developed param-eters which determine the degree of in-accuracy of a given sampling mesh(Switzer 1975) In cases where sometypes of geographic features vary oversmall areas and others over very largeareas such as the case of statewide land-use patterns the error is averaged over allfeatures and will affect features with ahigh-frequency variation more than others

The variation of the mesh size within theinformation system would be of only littlehelp The data management and the ma-nipulation and display routines would be-come more complex although probably lessthan many researchers imagine Exactfigures about the differences cannot begiven since no literature about such a sys-tem is known to the authors

jmiddotol 2 No1 April 1975 59

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

formation accessible to urban researchersneighborhood relationships are not madeexplicit Segments sharing a node forexample must be found by laborious searchprocedures Search is also required toassemble the outline of a polygon Moreimportantly the DIME structure is cum-hersome to use for many cartographic ap-plications involving areas made up of com-plex lines For procedures in a one-timechecking effort as is the case of AddressCoding and Address Matching in metro-politan areas it is quite adequate Forefficient computer storage and retrievaland for many applications however im-provements are desirable For examplethe reliance on the individual line segmentmakes the reduction of detail for displaypurposes difficult since line segments can-not be simply deleted without correctingthe reference codes for the affected nodes

At the Laboratory for Computer Graph-ics and Spatial Analysis the junior authorhas developed a data structure POLY-VRT that is designed to contain all theinformation needed to construct any of thepreviously enumerated planar structuresThe basic object of POLYVRT is thechain Like a DIME segment a chainhas nodes at its two ends separates twoareas and is assumed to be uncrossed Itdiffers in that the POLYVRT chain maybe made up of many points whereas thebasic DIME unit has only two points Aboundary between two polygons can bereferenced by a single chain no matterhow complicated because line detail istopologically unimportant (Laboratory forComputer Graphics and Spatial Analysis1974)

The coding of a complicated boundaryas a unit is not unique to POLYVRTThe data bank used in the project TheInteractive Map in Urban Research(Nake and Peucker 1972 Peucker (ed)1973) as well as the World Data Bank I(Schmidt 1969) are composed of linesIn the latter case some of them containover 4000 points The chain based systemof POLYVRT however is a differenttype of structure because of the topologicalrole assigned to the chain and the subse-

60

l[ucnt construction of a list data strnctureBased upon this assignment the topologi-cal information about a chain resembles theinformation on a DIME record exceptthat the distinction between nodes (iepoints used for more than one chain) andthe points internal to a chain allows in-ternal points to be eliminated without influencing the neighborhood relationshipsThe main innovation in developing a chainrepresentation is that areas of significantline detail may be efficiently handledTopological checking is reduced from de-pendency on the number of points to de-pendency on the number of boundaries

In addition to the indication of the rela-tive location of the chain with respect toits neighboring polygons POL YVRT in-formation is stored in separate lists assem-bling the bounding chains for every poly-gon Thus searches can take place in twodirections from the chain to the polygonand from the polygon to the chain Thisis very important for any type of neighbor-hood manipulation since neighboring enti-ties can be found through their boundingor bounded complements In otherwords to follow along a group of chainsone flips through chain to polygon to thenext chain etc whereas to traverse aseries of polygons one tests for adjacentpolygons by going through the chain di-rectory for each polygon

The POLYVRT program places pointinformation in secondary storage Thethree higher level objects (chains nodesand polygons) are core resident Onlythe chain refers directly to the point filein the secondary storage In addition toindicating the locations of the points in thepoint file the chain record incorporates thename of the chain the labels of the startingand ending nodes and the left and rightpolYJ~middotons Conversely the polygon listconsists of the bounding chains in propersequence (Figs 4 and 5) A list linkinsnodes to chains could easily be constructed

DATA STRUCTURES FORTHREE-DIMENSION AL SURFACES

Boehm (1967) describes in a very de-tailed analysis the advantages and disad-

The American Carfographrr

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 2: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

scribe my position on a piece of terrain Iwill not use my map to determine my lo-cation within the UTM-grid rather I willlook for nearby relief features (peaksrivers slopes roads) as orientation char-acteristics

In contrast when a geographic data bankof any kind is created it utilizes some kindof absolute coordinate system Usuallyneither the geographic evaluation at thetime of digitizing- nor the mapping systemused with the data allow the inclusion ofsuch cartographic features as streets riversand roads which would give us an indica-tion of relative location

While the human user can be aided inhis orientation on a map through overlaysor map comparison the computer has dif-ficulty in determining relative location Ifthe relative location in terms of the closestpoints for each of say 5000 points is tobe calculated the program literally has tocompute every points distance to everyother point Indeed some widely-usedprograms do this computation several timeswithin one program run

Some indication of the relative locationof a geographic feature can be very usefulThis neighborhood relationship will be re-ferred to as a neighborhood functionIt can be expressed in different ways asan explicit or implicit function or as a dis-crete function in the form of a table

The exPlicit function can be a polynomialor trigonometric equation set for a discretegrid of surface patches vhich give the formof the surface at each point within thepatch Typical for this approach is thework of Junkins et al (1973) Two-dimensional spline functions also fall intothis category (Holroyd and Bhattacharyya]970) A much more frequent way of de-fining a neighborhood is by the explicitfunction in the form of a sort routine whichfinds the closest neighbors This is donein various interpolation algorithms to pro-duce a regular grid of points (Shepard1968 Heiskanen and Moritz 1967) Thecomputations increase close to the squareof the increase in the number of pointssince the search has to be repeated forevery point and all points or at least a

56

large number of them must be processedeach time

This search procedure also applies inthe case of planar surfaces where neigh-boring polygons must be found For ex-ample when contiguity constraints are im-posed in problems of factorial ecology andother regional correlations all polygonpoints must be searched to find those whichare in common for a pair of polygonsAgain the problem increases in complexityaccording to the square of the number ofthe items being searched

The imPlicit function expressing neigh-borhood relationships is usually a functionthat describes the coding structure of thegeographic entities One very good caseis described in Rosenfeld (1969) for dif-ferent types of neighborhood relationshipswithin a regular grid in which the pointPij has the four neighbors (i + 1 j) (i j +1) (i-l j) (i j-l) and it has the eightneighbors (i+l j) (i+l j+l) (i j+l)(i-I j+l) (i-I j) (i-I j-l) (i j-1) (i+l j-I)

Neighborhood relationships in the formof tables are very rarely used This typerecords the neighborhood function bypointers indicating neighboring geo-graphic entities For example a structurewhich is built on the basis of Thiessenpolygons could have such a structure bysimply having the labels of the neighboringpoints accompany the record of each pointThe most widely-known structure of thistype is the DIME file of the US Censuswhich encodes line segments the names ofthe polygons to the left and right of eachline segment and the names of the twonodes at either end The neighborhoodrelationships used in the DIME develop-ment are derived from the discipline oftopology (Cooke and Maxfield 1967)

Neighborhood relationships will be dis-cltssed in greater detail in the analysis ofvarious existing data structures Twotypes of geographic data bases will be dis-cussed those defining planar surfaces andthose defining three-dimensional surfacesFor both types a summary of their his-torical development will be presented andit will be shown that although presently

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

at difIerent stages of clevelopl11llltthesetwo types can be treated as special casesof one topological data structure

DATA STRUCTURES FORPLANAR SURFACESTypes of Structures

The types of geographic entities onplanar surfaces are points lines and area-enclosing lines or polygons The latter areperhaps the most frequently encoded fea-ture in geographic data systems

The simplest data base system for planarsurfaces is that of encoding entity by entitywith little or no regard for entity overlapsor adjacencies (Fig 1) In other wordsevery polygon in a polygon system is en-coded and stored without any regard forcontiguous polygons and lines are encodedwithout regard for the fact that they mayintersect or merge with other lines Theresults of such an encoding are sliverlines (duplication of lines in slightly dif-ferent positions) These sliver lines areconfusing and unaesthetic and hence it isvirtually impossible to do anything directlywith such a data base except an extremelycoarse graphic image

To go beyond the use of such data forthe production of coarse images editingmust be performed This alternative hasbeen attempted in several cases one beingthe MAP-MODEL system (Arms 1970)The editing in this system is guided by theassumption that every segment has to berepresented twice except for segments onthe outer boundary For each segmentthe editing program sorts through all re-maining segments to find its complement(the identical line of the neighboring poly-gon) Those segments for which it hasnot found a complement are tagged to in-dicate potential errors to the user

To overcome some of the limitations ofindependently encoded entities systemshave been developed based on a commonlocation dictionary This dictionary con-tains the coordinates of every boundarypoint on the map Polygon boundary listsare then compiled which consist of thelabels (location numbers) of these bound-ary points (Fig 2) Line information can

fol 2 No1 April 1975

LOCATION LIST

POLYGON 1 X1 YI

x2 Yl2

Xln Ytn

POLYGON 2 X21 Y21

119 22 Y22

Fig 1 The simplest encoding is areal en-tity by areal entity In this system mostpoints are recorded twice and some (egP14 P2 2 Pa 7) three times A point doesnot nec~ssarily have identical coordinates inall recordings

be handled in the same way Programsbased on this structure include CAL-FORM from the Laboratory of ComputerGraphics and Spatial Analysis Otherprograms have subroutines to convert thispoint dictionary structure to the simpleentity-by-entity line list described aboveThe data canmiddot then be used in programssuch as SYMAP (Laboratory for Com-puter Graphics and Spatial Analysis)which are compatible with the entity-byentity structure Programs have also beendeveloped to simplify data input throughautomated polygon identification (Douglas1973)

57

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

POLYGON

3 125 242322214344454647483029 28 27261

5 4443 424140 393837 3635 44

Fig 2 In the second type of polygon sys-tem the individual points are encoded onlyonce and are stored in the Point DictionaryFor every polygon a Polygon Boundary Listis then established

The point dictionary data base has theadvantage that sliver lines do not occurHowever the problem of neighborhoodrelationships is not handled any better thanwith the entity-by-entity approach Thesearch for common lines is no longer ac-cording to the coordinates of points but bytheir labels this brings us closer to asolution only by a little less computer timeIt also creates difficulties A point diction-ary can and will be accessed in an arbitraryorder since there are no restrictions re-garding point placement The standard

1

POINT DICTIONARYPOINT

1 x y2 x y3 x y

48 x y

response to this problem is to make thedictionary core resident Unfortunatelythis will limit the complexity of the mapthat can be handled in this manner sinceall points must be stored throughout theoperation of the program This short-coming of such sharing of data is aug-mented by the continued independence ofthe entities created by the dictionary in-stead of n points with their r and y co-ordinates there are simply n referencesto points

Some of the objections to the ordinarypoint dictionary approach can be eliminatedby formulating an intermediate object be-tween the entity and the points used as anaddressing scheme (Nake and Peucker1972 Peucker (ed) 1973) A geographicentity can be created from a list of line seg-mentso these segments are in turn createdfrom references to the point dictionaryThis system allows for easy definition ofthe entities with a minimum of pointersbut each entity is still independent in thesense that its neighbors are not knownThe direction of access is still from entityto location but not the reverse

All of the data structures described thusfar are of limited flexibility and utilitybecause neighborhood relationships are notknown By adding the topological neigh-borhood function of each element to a datastructure large improvements in flexibilityand scope of applications can be realized

If one is concerned about the memorycapacity which is needed for the storage ofexplicit neighborhood relationships onemight consider a system with implicitneighborhood functions by modifying theentity form in the encoding stage Manyexisting geographic information systemsstore land-use data in grids of rectangularcells (Hsu 1975) However a seriousproblem is encountered with such a regulardiscrete encoding of planar surfaces Ac-cording to the sampling theorem thesampling interval has to be half the size ofthe smallest features to be encoded (egTobler 1969) Hence either the size ofthe cells has to be very small to enable en-coding of detailed variations (eg urbanland uses) or a grosser cell size can be

58 The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

CENSUS ADDRESS CODING GUIDE RECORDS

DIME STREET SEGMENT RECORDS

STREET NODE NODE TRACT BLOCK TRACT BLOCK LOW HIGHSTART END LEFT LEFT RIGHT RIGHT ADDR ADD~

Main 5 6 105 104 1 10Main 6 7 106 103 11 17Main 7 8 101 103 19 28Malt 8 9 101 102 30 42

Fig 3 The DIME-file base Representedis a portion of Tract 1 in a hypothetical city(Cooke and Maxfield 1967) The smallnumbers at middotthe street intersections are thenodes the larger numbers on Main Streetare the addresses Two types of records arccreated as illustrated in the tables

]D-_2 10_3__ ~

]I 105 JD[4 GROVE STREET 3

HIGH ADDRESS

4228

1091741

LOW ADDRESS

3012211119

28

419~0[~

8 ELM STREET 11

BLOCK

102103104105106101

~TRACT ~~

19

CHURCH

101

MainMain

MainMainMainMain

STREET TRACT

system for the Address Coding effort ofthe 1970 census

The basic element of the DIME file is aline segment defined by two end pointsIt is assumed that the segment is straightand not crossed by any other line Themetropolitan files usually define this unitas a street block face Complex lines arerepresented by a series of segments ap-proximating the line The segment hastwo node identifiers along with the co-ordinates of its two end points and codesfor the polygon on each side of the seg-ment

While DIME topology makes much 111-

Structures Using ExplicitTopological Relationships

One of the first known attempts to in-corporate explicit topological structure intoa geographic data base is the Dual Inde-pendent Map Encoding (DIME) systemof the US Bureau of the Census (Fig 3)The DIME files were originally developedas an automated topological error detection

used for encoding only the very slowlychanging variations (generalized landuse) In the case of the small units a verylarge volume of redundant information iscreated in an area of uniform land use andmatrix reduction techniques such as run-length encoding (recording in each rowonly those cells differing from an immedi-ate]y previous cell) create only physicaland not logical compaction (Amidon andAkin 1970) In the case of the grossercell size highly varying land use featuresare aggregated to a degree which reducesthe usefulness of the whole system

Beyond sampling problems a grid struc-ture imposes a bias toward specific orienta-tions of features Diagonals althoughphysically longer are given the apparentcell relationships of unit distances alongthe major axes The grid structure alsocreates the illusion of working with a dis-crete point space rather than a regularareal partitioning

Recent studies have developed param-eters which determine the degree of in-accuracy of a given sampling mesh(Switzer 1975) In cases where sometypes of geographic features vary oversmall areas and others over very largeareas such as the case of statewide land-use patterns the error is averaged over allfeatures and will affect features with ahigh-frequency variation more than others

The variation of the mesh size within theinformation system would be of only littlehelp The data management and the ma-nipulation and display routines would be-come more complex although probably lessthan many researchers imagine Exactfigures about the differences cannot begiven since no literature about such a sys-tem is known to the authors

jmiddotol 2 No1 April 1975 59

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

formation accessible to urban researchersneighborhood relationships are not madeexplicit Segments sharing a node forexample must be found by laborious searchprocedures Search is also required toassemble the outline of a polygon Moreimportantly the DIME structure is cum-hersome to use for many cartographic ap-plications involving areas made up of com-plex lines For procedures in a one-timechecking effort as is the case of AddressCoding and Address Matching in metro-politan areas it is quite adequate Forefficient computer storage and retrievaland for many applications however im-provements are desirable For examplethe reliance on the individual line segmentmakes the reduction of detail for displaypurposes difficult since line segments can-not be simply deleted without correctingthe reference codes for the affected nodes

At the Laboratory for Computer Graph-ics and Spatial Analysis the junior authorhas developed a data structure POLY-VRT that is designed to contain all theinformation needed to construct any of thepreviously enumerated planar structuresThe basic object of POLYVRT is thechain Like a DIME segment a chainhas nodes at its two ends separates twoareas and is assumed to be uncrossed Itdiffers in that the POLYVRT chain maybe made up of many points whereas thebasic DIME unit has only two points Aboundary between two polygons can bereferenced by a single chain no matterhow complicated because line detail istopologically unimportant (Laboratory forComputer Graphics and Spatial Analysis1974)

The coding of a complicated boundaryas a unit is not unique to POLYVRTThe data bank used in the project TheInteractive Map in Urban Research(Nake and Peucker 1972 Peucker (ed)1973) as well as the World Data Bank I(Schmidt 1969) are composed of linesIn the latter case some of them containover 4000 points The chain based systemof POLYVRT however is a differenttype of structure because of the topologicalrole assigned to the chain and the subse-

60

l[ucnt construction of a list data strnctureBased upon this assignment the topologi-cal information about a chain resembles theinformation on a DIME record exceptthat the distinction between nodes (iepoints used for more than one chain) andthe points internal to a chain allows in-ternal points to be eliminated without influencing the neighborhood relationshipsThe main innovation in developing a chainrepresentation is that areas of significantline detail may be efficiently handledTopological checking is reduced from de-pendency on the number of points to de-pendency on the number of boundaries

In addition to the indication of the rela-tive location of the chain with respect toits neighboring polygons POL YVRT in-formation is stored in separate lists assem-bling the bounding chains for every poly-gon Thus searches can take place in twodirections from the chain to the polygonand from the polygon to the chain Thisis very important for any type of neighbor-hood manipulation since neighboring enti-ties can be found through their boundingor bounded complements In otherwords to follow along a group of chainsone flips through chain to polygon to thenext chain etc whereas to traverse aseries of polygons one tests for adjacentpolygons by going through the chain di-rectory for each polygon

The POLYVRT program places pointinformation in secondary storage Thethree higher level objects (chains nodesand polygons) are core resident Onlythe chain refers directly to the point filein the secondary storage In addition toindicating the locations of the points in thepoint file the chain record incorporates thename of the chain the labels of the startingand ending nodes and the left and rightpolYJ~middotons Conversely the polygon listconsists of the bounding chains in propersequence (Figs 4 and 5) A list linkinsnodes to chains could easily be constructed

DATA STRUCTURES FORTHREE-DIMENSION AL SURFACES

Boehm (1967) describes in a very de-tailed analysis the advantages and disad-

The American Carfographrr

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 3: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

at difIerent stages of clevelopl11llltthesetwo types can be treated as special casesof one topological data structure

DATA STRUCTURES FORPLANAR SURFACESTypes of Structures

The types of geographic entities onplanar surfaces are points lines and area-enclosing lines or polygons The latter areperhaps the most frequently encoded fea-ture in geographic data systems

The simplest data base system for planarsurfaces is that of encoding entity by entitywith little or no regard for entity overlapsor adjacencies (Fig 1) In other wordsevery polygon in a polygon system is en-coded and stored without any regard forcontiguous polygons and lines are encodedwithout regard for the fact that they mayintersect or merge with other lines Theresults of such an encoding are sliverlines (duplication of lines in slightly dif-ferent positions) These sliver lines areconfusing and unaesthetic and hence it isvirtually impossible to do anything directlywith such a data base except an extremelycoarse graphic image

To go beyond the use of such data forthe production of coarse images editingmust be performed This alternative hasbeen attempted in several cases one beingthe MAP-MODEL system (Arms 1970)The editing in this system is guided by theassumption that every segment has to berepresented twice except for segments onthe outer boundary For each segmentthe editing program sorts through all re-maining segments to find its complement(the identical line of the neighboring poly-gon) Those segments for which it hasnot found a complement are tagged to in-dicate potential errors to the user

To overcome some of the limitations ofindependently encoded entities systemshave been developed based on a commonlocation dictionary This dictionary con-tains the coordinates of every boundarypoint on the map Polygon boundary listsare then compiled which consist of thelabels (location numbers) of these bound-ary points (Fig 2) Line information can

fol 2 No1 April 1975

LOCATION LIST

POLYGON 1 X1 YI

x2 Yl2

Xln Ytn

POLYGON 2 X21 Y21

119 22 Y22

Fig 1 The simplest encoding is areal en-tity by areal entity In this system mostpoints are recorded twice and some (egP14 P2 2 Pa 7) three times A point doesnot nec~ssarily have identical coordinates inall recordings

be handled in the same way Programsbased on this structure include CAL-FORM from the Laboratory of ComputerGraphics and Spatial Analysis Otherprograms have subroutines to convert thispoint dictionary structure to the simpleentity-by-entity line list described aboveThe data canmiddot then be used in programssuch as SYMAP (Laboratory for Com-puter Graphics and Spatial Analysis)which are compatible with the entity-byentity structure Programs have also beendeveloped to simplify data input throughautomated polygon identification (Douglas1973)

57

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

POLYGON

3 125 242322214344454647483029 28 27261

5 4443 424140 393837 3635 44

Fig 2 In the second type of polygon sys-tem the individual points are encoded onlyonce and are stored in the Point DictionaryFor every polygon a Polygon Boundary Listis then established

The point dictionary data base has theadvantage that sliver lines do not occurHowever the problem of neighborhoodrelationships is not handled any better thanwith the entity-by-entity approach Thesearch for common lines is no longer ac-cording to the coordinates of points but bytheir labels this brings us closer to asolution only by a little less computer timeIt also creates difficulties A point diction-ary can and will be accessed in an arbitraryorder since there are no restrictions re-garding point placement The standard

1

POINT DICTIONARYPOINT

1 x y2 x y3 x y

48 x y

response to this problem is to make thedictionary core resident Unfortunatelythis will limit the complexity of the mapthat can be handled in this manner sinceall points must be stored throughout theoperation of the program This short-coming of such sharing of data is aug-mented by the continued independence ofthe entities created by the dictionary in-stead of n points with their r and y co-ordinates there are simply n referencesto points

Some of the objections to the ordinarypoint dictionary approach can be eliminatedby formulating an intermediate object be-tween the entity and the points used as anaddressing scheme (Nake and Peucker1972 Peucker (ed) 1973) A geographicentity can be created from a list of line seg-mentso these segments are in turn createdfrom references to the point dictionaryThis system allows for easy definition ofthe entities with a minimum of pointersbut each entity is still independent in thesense that its neighbors are not knownThe direction of access is still from entityto location but not the reverse

All of the data structures described thusfar are of limited flexibility and utilitybecause neighborhood relationships are notknown By adding the topological neigh-borhood function of each element to a datastructure large improvements in flexibilityand scope of applications can be realized

If one is concerned about the memorycapacity which is needed for the storage ofexplicit neighborhood relationships onemight consider a system with implicitneighborhood functions by modifying theentity form in the encoding stage Manyexisting geographic information systemsstore land-use data in grids of rectangularcells (Hsu 1975) However a seriousproblem is encountered with such a regulardiscrete encoding of planar surfaces Ac-cording to the sampling theorem thesampling interval has to be half the size ofthe smallest features to be encoded (egTobler 1969) Hence either the size ofthe cells has to be very small to enable en-coding of detailed variations (eg urbanland uses) or a grosser cell size can be

58 The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

CENSUS ADDRESS CODING GUIDE RECORDS

DIME STREET SEGMENT RECORDS

STREET NODE NODE TRACT BLOCK TRACT BLOCK LOW HIGHSTART END LEFT LEFT RIGHT RIGHT ADDR ADD~

Main 5 6 105 104 1 10Main 6 7 106 103 11 17Main 7 8 101 103 19 28Malt 8 9 101 102 30 42

Fig 3 The DIME-file base Representedis a portion of Tract 1 in a hypothetical city(Cooke and Maxfield 1967) The smallnumbers at middotthe street intersections are thenodes the larger numbers on Main Streetare the addresses Two types of records arccreated as illustrated in the tables

]D-_2 10_3__ ~

]I 105 JD[4 GROVE STREET 3

HIGH ADDRESS

4228

1091741

LOW ADDRESS

3012211119

28

419~0[~

8 ELM STREET 11

BLOCK

102103104105106101

~TRACT ~~

19

CHURCH

101

MainMain

MainMainMainMain

STREET TRACT

system for the Address Coding effort ofthe 1970 census

The basic element of the DIME file is aline segment defined by two end pointsIt is assumed that the segment is straightand not crossed by any other line Themetropolitan files usually define this unitas a street block face Complex lines arerepresented by a series of segments ap-proximating the line The segment hastwo node identifiers along with the co-ordinates of its two end points and codesfor the polygon on each side of the seg-ment

While DIME topology makes much 111-

Structures Using ExplicitTopological Relationships

One of the first known attempts to in-corporate explicit topological structure intoa geographic data base is the Dual Inde-pendent Map Encoding (DIME) systemof the US Bureau of the Census (Fig 3)The DIME files were originally developedas an automated topological error detection

used for encoding only the very slowlychanging variations (generalized landuse) In the case of the small units a verylarge volume of redundant information iscreated in an area of uniform land use andmatrix reduction techniques such as run-length encoding (recording in each rowonly those cells differing from an immedi-ate]y previous cell) create only physicaland not logical compaction (Amidon andAkin 1970) In the case of the grossercell size highly varying land use featuresare aggregated to a degree which reducesthe usefulness of the whole system

Beyond sampling problems a grid struc-ture imposes a bias toward specific orienta-tions of features Diagonals althoughphysically longer are given the apparentcell relationships of unit distances alongthe major axes The grid structure alsocreates the illusion of working with a dis-crete point space rather than a regularareal partitioning

Recent studies have developed param-eters which determine the degree of in-accuracy of a given sampling mesh(Switzer 1975) In cases where sometypes of geographic features vary oversmall areas and others over very largeareas such as the case of statewide land-use patterns the error is averaged over allfeatures and will affect features with ahigh-frequency variation more than others

The variation of the mesh size within theinformation system would be of only littlehelp The data management and the ma-nipulation and display routines would be-come more complex although probably lessthan many researchers imagine Exactfigures about the differences cannot begiven since no literature about such a sys-tem is known to the authors

jmiddotol 2 No1 April 1975 59

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

formation accessible to urban researchersneighborhood relationships are not madeexplicit Segments sharing a node forexample must be found by laborious searchprocedures Search is also required toassemble the outline of a polygon Moreimportantly the DIME structure is cum-hersome to use for many cartographic ap-plications involving areas made up of com-plex lines For procedures in a one-timechecking effort as is the case of AddressCoding and Address Matching in metro-politan areas it is quite adequate Forefficient computer storage and retrievaland for many applications however im-provements are desirable For examplethe reliance on the individual line segmentmakes the reduction of detail for displaypurposes difficult since line segments can-not be simply deleted without correctingthe reference codes for the affected nodes

At the Laboratory for Computer Graph-ics and Spatial Analysis the junior authorhas developed a data structure POLY-VRT that is designed to contain all theinformation needed to construct any of thepreviously enumerated planar structuresThe basic object of POLYVRT is thechain Like a DIME segment a chainhas nodes at its two ends separates twoareas and is assumed to be uncrossed Itdiffers in that the POLYVRT chain maybe made up of many points whereas thebasic DIME unit has only two points Aboundary between two polygons can bereferenced by a single chain no matterhow complicated because line detail istopologically unimportant (Laboratory forComputer Graphics and Spatial Analysis1974)

The coding of a complicated boundaryas a unit is not unique to POLYVRTThe data bank used in the project TheInteractive Map in Urban Research(Nake and Peucker 1972 Peucker (ed)1973) as well as the World Data Bank I(Schmidt 1969) are composed of linesIn the latter case some of them containover 4000 points The chain based systemof POLYVRT however is a differenttype of structure because of the topologicalrole assigned to the chain and the subse-

60

l[ucnt construction of a list data strnctureBased upon this assignment the topologi-cal information about a chain resembles theinformation on a DIME record exceptthat the distinction between nodes (iepoints used for more than one chain) andthe points internal to a chain allows in-ternal points to be eliminated without influencing the neighborhood relationshipsThe main innovation in developing a chainrepresentation is that areas of significantline detail may be efficiently handledTopological checking is reduced from de-pendency on the number of points to de-pendency on the number of boundaries

In addition to the indication of the rela-tive location of the chain with respect toits neighboring polygons POL YVRT in-formation is stored in separate lists assem-bling the bounding chains for every poly-gon Thus searches can take place in twodirections from the chain to the polygonand from the polygon to the chain Thisis very important for any type of neighbor-hood manipulation since neighboring enti-ties can be found through their boundingor bounded complements In otherwords to follow along a group of chainsone flips through chain to polygon to thenext chain etc whereas to traverse aseries of polygons one tests for adjacentpolygons by going through the chain di-rectory for each polygon

The POLYVRT program places pointinformation in secondary storage Thethree higher level objects (chains nodesand polygons) are core resident Onlythe chain refers directly to the point filein the secondary storage In addition toindicating the locations of the points in thepoint file the chain record incorporates thename of the chain the labels of the startingand ending nodes and the left and rightpolYJ~middotons Conversely the polygon listconsists of the bounding chains in propersequence (Figs 4 and 5) A list linkinsnodes to chains could easily be constructed

DATA STRUCTURES FORTHREE-DIMENSION AL SURFACES

Boehm (1967) describes in a very de-tailed analysis the advantages and disad-

The American Carfographrr

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 4: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

POLYGON

3 125 242322214344454647483029 28 27261

5 4443 424140 393837 3635 44

Fig 2 In the second type of polygon sys-tem the individual points are encoded onlyonce and are stored in the Point DictionaryFor every polygon a Polygon Boundary Listis then established

The point dictionary data base has theadvantage that sliver lines do not occurHowever the problem of neighborhoodrelationships is not handled any better thanwith the entity-by-entity approach Thesearch for common lines is no longer ac-cording to the coordinates of points but bytheir labels this brings us closer to asolution only by a little less computer timeIt also creates difficulties A point diction-ary can and will be accessed in an arbitraryorder since there are no restrictions re-garding point placement The standard

1

POINT DICTIONARYPOINT

1 x y2 x y3 x y

48 x y

response to this problem is to make thedictionary core resident Unfortunatelythis will limit the complexity of the mapthat can be handled in this manner sinceall points must be stored throughout theoperation of the program This short-coming of such sharing of data is aug-mented by the continued independence ofthe entities created by the dictionary in-stead of n points with their r and y co-ordinates there are simply n referencesto points

Some of the objections to the ordinarypoint dictionary approach can be eliminatedby formulating an intermediate object be-tween the entity and the points used as anaddressing scheme (Nake and Peucker1972 Peucker (ed) 1973) A geographicentity can be created from a list of line seg-mentso these segments are in turn createdfrom references to the point dictionaryThis system allows for easy definition ofthe entities with a minimum of pointersbut each entity is still independent in thesense that its neighbors are not knownThe direction of access is still from entityto location but not the reverse

All of the data structures described thusfar are of limited flexibility and utilitybecause neighborhood relationships are notknown By adding the topological neigh-borhood function of each element to a datastructure large improvements in flexibilityand scope of applications can be realized

If one is concerned about the memorycapacity which is needed for the storage ofexplicit neighborhood relationships onemight consider a system with implicitneighborhood functions by modifying theentity form in the encoding stage Manyexisting geographic information systemsstore land-use data in grids of rectangularcells (Hsu 1975) However a seriousproblem is encountered with such a regulardiscrete encoding of planar surfaces Ac-cording to the sampling theorem thesampling interval has to be half the size ofthe smallest features to be encoded (egTobler 1969) Hence either the size ofthe cells has to be very small to enable en-coding of detailed variations (eg urbanland uses) or a grosser cell size can be

58 The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

CENSUS ADDRESS CODING GUIDE RECORDS

DIME STREET SEGMENT RECORDS

STREET NODE NODE TRACT BLOCK TRACT BLOCK LOW HIGHSTART END LEFT LEFT RIGHT RIGHT ADDR ADD~

Main 5 6 105 104 1 10Main 6 7 106 103 11 17Main 7 8 101 103 19 28Malt 8 9 101 102 30 42

Fig 3 The DIME-file base Representedis a portion of Tract 1 in a hypothetical city(Cooke and Maxfield 1967) The smallnumbers at middotthe street intersections are thenodes the larger numbers on Main Streetare the addresses Two types of records arccreated as illustrated in the tables

]D-_2 10_3__ ~

]I 105 JD[4 GROVE STREET 3

HIGH ADDRESS

4228

1091741

LOW ADDRESS

3012211119

28

419~0[~

8 ELM STREET 11

BLOCK

102103104105106101

~TRACT ~~

19

CHURCH

101

MainMain

MainMainMainMain

STREET TRACT

system for the Address Coding effort ofthe 1970 census

The basic element of the DIME file is aline segment defined by two end pointsIt is assumed that the segment is straightand not crossed by any other line Themetropolitan files usually define this unitas a street block face Complex lines arerepresented by a series of segments ap-proximating the line The segment hastwo node identifiers along with the co-ordinates of its two end points and codesfor the polygon on each side of the seg-ment

While DIME topology makes much 111-

Structures Using ExplicitTopological Relationships

One of the first known attempts to in-corporate explicit topological structure intoa geographic data base is the Dual Inde-pendent Map Encoding (DIME) systemof the US Bureau of the Census (Fig 3)The DIME files were originally developedas an automated topological error detection

used for encoding only the very slowlychanging variations (generalized landuse) In the case of the small units a verylarge volume of redundant information iscreated in an area of uniform land use andmatrix reduction techniques such as run-length encoding (recording in each rowonly those cells differing from an immedi-ate]y previous cell) create only physicaland not logical compaction (Amidon andAkin 1970) In the case of the grossercell size highly varying land use featuresare aggregated to a degree which reducesthe usefulness of the whole system

Beyond sampling problems a grid struc-ture imposes a bias toward specific orienta-tions of features Diagonals althoughphysically longer are given the apparentcell relationships of unit distances alongthe major axes The grid structure alsocreates the illusion of working with a dis-crete point space rather than a regularareal partitioning

Recent studies have developed param-eters which determine the degree of in-accuracy of a given sampling mesh(Switzer 1975) In cases where sometypes of geographic features vary oversmall areas and others over very largeareas such as the case of statewide land-use patterns the error is averaged over allfeatures and will affect features with ahigh-frequency variation more than others

The variation of the mesh size within theinformation system would be of only littlehelp The data management and the ma-nipulation and display routines would be-come more complex although probably lessthan many researchers imagine Exactfigures about the differences cannot begiven since no literature about such a sys-tem is known to the authors

jmiddotol 2 No1 April 1975 59

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

formation accessible to urban researchersneighborhood relationships are not madeexplicit Segments sharing a node forexample must be found by laborious searchprocedures Search is also required toassemble the outline of a polygon Moreimportantly the DIME structure is cum-hersome to use for many cartographic ap-plications involving areas made up of com-plex lines For procedures in a one-timechecking effort as is the case of AddressCoding and Address Matching in metro-politan areas it is quite adequate Forefficient computer storage and retrievaland for many applications however im-provements are desirable For examplethe reliance on the individual line segmentmakes the reduction of detail for displaypurposes difficult since line segments can-not be simply deleted without correctingthe reference codes for the affected nodes

At the Laboratory for Computer Graph-ics and Spatial Analysis the junior authorhas developed a data structure POLY-VRT that is designed to contain all theinformation needed to construct any of thepreviously enumerated planar structuresThe basic object of POLYVRT is thechain Like a DIME segment a chainhas nodes at its two ends separates twoareas and is assumed to be uncrossed Itdiffers in that the POLYVRT chain maybe made up of many points whereas thebasic DIME unit has only two points Aboundary between two polygons can bereferenced by a single chain no matterhow complicated because line detail istopologically unimportant (Laboratory forComputer Graphics and Spatial Analysis1974)

The coding of a complicated boundaryas a unit is not unique to POLYVRTThe data bank used in the project TheInteractive Map in Urban Research(Nake and Peucker 1972 Peucker (ed)1973) as well as the World Data Bank I(Schmidt 1969) are composed of linesIn the latter case some of them containover 4000 points The chain based systemof POLYVRT however is a differenttype of structure because of the topologicalrole assigned to the chain and the subse-

60

l[ucnt construction of a list data strnctureBased upon this assignment the topologi-cal information about a chain resembles theinformation on a DIME record exceptthat the distinction between nodes (iepoints used for more than one chain) andthe points internal to a chain allows in-ternal points to be eliminated without influencing the neighborhood relationshipsThe main innovation in developing a chainrepresentation is that areas of significantline detail may be efficiently handledTopological checking is reduced from de-pendency on the number of points to de-pendency on the number of boundaries

In addition to the indication of the rela-tive location of the chain with respect toits neighboring polygons POL YVRT in-formation is stored in separate lists assem-bling the bounding chains for every poly-gon Thus searches can take place in twodirections from the chain to the polygonand from the polygon to the chain Thisis very important for any type of neighbor-hood manipulation since neighboring enti-ties can be found through their boundingor bounded complements In otherwords to follow along a group of chainsone flips through chain to polygon to thenext chain etc whereas to traverse aseries of polygons one tests for adjacentpolygons by going through the chain di-rectory for each polygon

The POLYVRT program places pointinformation in secondary storage Thethree higher level objects (chains nodesand polygons) are core resident Onlythe chain refers directly to the point filein the secondary storage In addition toindicating the locations of the points in thepoint file the chain record incorporates thename of the chain the labels of the startingand ending nodes and the left and rightpolYJ~middotons Conversely the polygon listconsists of the bounding chains in propersequence (Figs 4 and 5) A list linkinsnodes to chains could easily be constructed

DATA STRUCTURES FORTHREE-DIMENSION AL SURFACES

Boehm (1967) describes in a very de-tailed analysis the advantages and disad-

The American Carfographrr

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 5: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

CENSUS ADDRESS CODING GUIDE RECORDS

DIME STREET SEGMENT RECORDS

STREET NODE NODE TRACT BLOCK TRACT BLOCK LOW HIGHSTART END LEFT LEFT RIGHT RIGHT ADDR ADD~

Main 5 6 105 104 1 10Main 6 7 106 103 11 17Main 7 8 101 103 19 28Malt 8 9 101 102 30 42

Fig 3 The DIME-file base Representedis a portion of Tract 1 in a hypothetical city(Cooke and Maxfield 1967) The smallnumbers at middotthe street intersections are thenodes the larger numbers on Main Streetare the addresses Two types of records arccreated as illustrated in the tables

]D-_2 10_3__ ~

]I 105 JD[4 GROVE STREET 3

HIGH ADDRESS

4228

1091741

LOW ADDRESS

3012211119

28

419~0[~

8 ELM STREET 11

BLOCK

102103104105106101

~TRACT ~~

19

CHURCH

101

MainMain

MainMainMainMain

STREET TRACT

system for the Address Coding effort ofthe 1970 census

The basic element of the DIME file is aline segment defined by two end pointsIt is assumed that the segment is straightand not crossed by any other line Themetropolitan files usually define this unitas a street block face Complex lines arerepresented by a series of segments ap-proximating the line The segment hastwo node identifiers along with the co-ordinates of its two end points and codesfor the polygon on each side of the seg-ment

While DIME topology makes much 111-

Structures Using ExplicitTopological Relationships

One of the first known attempts to in-corporate explicit topological structure intoa geographic data base is the Dual Inde-pendent Map Encoding (DIME) systemof the US Bureau of the Census (Fig 3)The DIME files were originally developedas an automated topological error detection

used for encoding only the very slowlychanging variations (generalized landuse) In the case of the small units a verylarge volume of redundant information iscreated in an area of uniform land use andmatrix reduction techniques such as run-length encoding (recording in each rowonly those cells differing from an immedi-ate]y previous cell) create only physicaland not logical compaction (Amidon andAkin 1970) In the case of the grossercell size highly varying land use featuresare aggregated to a degree which reducesthe usefulness of the whole system

Beyond sampling problems a grid struc-ture imposes a bias toward specific orienta-tions of features Diagonals althoughphysically longer are given the apparentcell relationships of unit distances alongthe major axes The grid structure alsocreates the illusion of working with a dis-crete point space rather than a regularareal partitioning

Recent studies have developed param-eters which determine the degree of in-accuracy of a given sampling mesh(Switzer 1975) In cases where sometypes of geographic features vary oversmall areas and others over very largeareas such as the case of statewide land-use patterns the error is averaged over allfeatures and will affect features with ahigh-frequency variation more than others

The variation of the mesh size within theinformation system would be of only littlehelp The data management and the ma-nipulation and display routines would be-come more complex although probably lessthan many researchers imagine Exactfigures about the differences cannot begiven since no literature about such a sys-tem is known to the authors

jmiddotol 2 No1 April 1975 59

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

formation accessible to urban researchersneighborhood relationships are not madeexplicit Segments sharing a node forexample must be found by laborious searchprocedures Search is also required toassemble the outline of a polygon Moreimportantly the DIME structure is cum-hersome to use for many cartographic ap-plications involving areas made up of com-plex lines For procedures in a one-timechecking effort as is the case of AddressCoding and Address Matching in metro-politan areas it is quite adequate Forefficient computer storage and retrievaland for many applications however im-provements are desirable For examplethe reliance on the individual line segmentmakes the reduction of detail for displaypurposes difficult since line segments can-not be simply deleted without correctingthe reference codes for the affected nodes

At the Laboratory for Computer Graph-ics and Spatial Analysis the junior authorhas developed a data structure POLY-VRT that is designed to contain all theinformation needed to construct any of thepreviously enumerated planar structuresThe basic object of POLYVRT is thechain Like a DIME segment a chainhas nodes at its two ends separates twoareas and is assumed to be uncrossed Itdiffers in that the POLYVRT chain maybe made up of many points whereas thebasic DIME unit has only two points Aboundary between two polygons can bereferenced by a single chain no matterhow complicated because line detail istopologically unimportant (Laboratory forComputer Graphics and Spatial Analysis1974)

The coding of a complicated boundaryas a unit is not unique to POLYVRTThe data bank used in the project TheInteractive Map in Urban Research(Nake and Peucker 1972 Peucker (ed)1973) as well as the World Data Bank I(Schmidt 1969) are composed of linesIn the latter case some of them containover 4000 points The chain based systemof POLYVRT however is a differenttype of structure because of the topologicalrole assigned to the chain and the subse-

60

l[ucnt construction of a list data strnctureBased upon this assignment the topologi-cal information about a chain resembles theinformation on a DIME record exceptthat the distinction between nodes (iepoints used for more than one chain) andthe points internal to a chain allows in-ternal points to be eliminated without influencing the neighborhood relationshipsThe main innovation in developing a chainrepresentation is that areas of significantline detail may be efficiently handledTopological checking is reduced from de-pendency on the number of points to de-pendency on the number of boundaries

In addition to the indication of the rela-tive location of the chain with respect toits neighboring polygons POL YVRT in-formation is stored in separate lists assem-bling the bounding chains for every poly-gon Thus searches can take place in twodirections from the chain to the polygonand from the polygon to the chain Thisis very important for any type of neighbor-hood manipulation since neighboring enti-ties can be found through their boundingor bounded complements In otherwords to follow along a group of chainsone flips through chain to polygon to thenext chain etc whereas to traverse aseries of polygons one tests for adjacentpolygons by going through the chain di-rectory for each polygon

The POLYVRT program places pointinformation in secondary storage Thethree higher level objects (chains nodesand polygons) are core resident Onlythe chain refers directly to the point filein the secondary storage In addition toindicating the locations of the points in thepoint file the chain record incorporates thename of the chain the labels of the startingand ending nodes and the left and rightpolYJ~middotons Conversely the polygon listconsists of the bounding chains in propersequence (Figs 4 and 5) A list linkinsnodes to chains could easily be constructed

DATA STRUCTURES FORTHREE-DIMENSION AL SURFACES

Boehm (1967) describes in a very de-tailed analysis the advantages and disad-

The American Carfographrr

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 6: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

formation accessible to urban researchersneighborhood relationships are not madeexplicit Segments sharing a node forexample must be found by laborious searchprocedures Search is also required toassemble the outline of a polygon Moreimportantly the DIME structure is cum-hersome to use for many cartographic ap-plications involving areas made up of com-plex lines For procedures in a one-timechecking effort as is the case of AddressCoding and Address Matching in metro-politan areas it is quite adequate Forefficient computer storage and retrievaland for many applications however im-provements are desirable For examplethe reliance on the individual line segmentmakes the reduction of detail for displaypurposes difficult since line segments can-not be simply deleted without correctingthe reference codes for the affected nodes

At the Laboratory for Computer Graph-ics and Spatial Analysis the junior authorhas developed a data structure POLY-VRT that is designed to contain all theinformation needed to construct any of thepreviously enumerated planar structuresThe basic object of POLYVRT is thechain Like a DIME segment a chainhas nodes at its two ends separates twoareas and is assumed to be uncrossed Itdiffers in that the POLYVRT chain maybe made up of many points whereas thebasic DIME unit has only two points Aboundary between two polygons can bereferenced by a single chain no matterhow complicated because line detail istopologically unimportant (Laboratory forComputer Graphics and Spatial Analysis1974)

The coding of a complicated boundaryas a unit is not unique to POLYVRTThe data bank used in the project TheInteractive Map in Urban Research(Nake and Peucker 1972 Peucker (ed)1973) as well as the World Data Bank I(Schmidt 1969) are composed of linesIn the latter case some of them containover 4000 points The chain based systemof POLYVRT however is a differenttype of structure because of the topologicalrole assigned to the chain and the subse-

60

l[ucnt construction of a list data strnctureBased upon this assignment the topologi-cal information about a chain resembles theinformation on a DIME record exceptthat the distinction between nodes (iepoints used for more than one chain) andthe points internal to a chain allows in-ternal points to be eliminated without influencing the neighborhood relationshipsThe main innovation in developing a chainrepresentation is that areas of significantline detail may be efficiently handledTopological checking is reduced from de-pendency on the number of points to de-pendency on the number of boundaries

In addition to the indication of the rela-tive location of the chain with respect toits neighboring polygons POL YVRT in-formation is stored in separate lists assem-bling the bounding chains for every poly-gon Thus searches can take place in twodirections from the chain to the polygonand from the polygon to the chain Thisis very important for any type of neighbor-hood manipulation since neighboring enti-ties can be found through their boundingor bounded complements In otherwords to follow along a group of chainsone flips through chain to polygon to thenext chain etc whereas to traverse aseries of polygons one tests for adjacentpolygons by going through the chain di-rectory for each polygon

The POLYVRT program places pointinformation in secondary storage Thethree higher level objects (chains nodesand polygons) are core resident Onlythe chain refers directly to the point filein the secondary storage In addition toindicating the locations of the points in thepoint file the chain record incorporates thename of the chain the labels of the startingand ending nodes and the left and rightpolYJ~middotons Conversely the polygon listconsists of the bounding chains in propersequence (Figs 4 and 5) A list linkinsnodes to chains could easily be constructed

DATA STRUCTURES FORTHREE-DIMENSION AL SURFACES

Boehm (1967) describes in a very de-tailed analysis the advantages and disad-

The American Carfographrr

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 7: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

2 POLYGON 22 NODE

888 CHAIN 3 POINT WITHINCHAIN

Fig 4 The external representation of thePOLYVRT chain-file Every chain has aname the number of inner points (length)the two limiting nodes the two boundarypolygons and a series of coordinates for thepoints NODES

prisingly this topic has produced little dis-cussion despite the fact that extremelylarge data banks of terrain (digital terrainmodels) have been developed

When encoding surfaces it is necessaryto adapt the density of points to the varia-tion in the local terrain The question ofhow dense the points have to be can beanswered by again using the philosophy ofthe sampling theorem For the terrainwithin a typical map the variation canchange considerably resulting in a needfor frequent adjustments of the samplinginterval

In a contour map the density of contourlines changes with the density of reliefvariation Therefore it fulfills the re-quirements set by the sampling theoremfor a non-stationary surface ie a sur-face with changing terrain For the regu-lar grid on the other hand if the smallestobject one wishes to detect anywherewithin a study area is of size (wavelength) 5 then the grid spacing every-where must be 52 or less (note for ex-ample Mark 1974) The regular gridagain tends toward redundancy sincesmooth areas within the study area willcontain far more points than are needed toaccurately portray their form To improvethe resolution of a grid by a factor oft the grid spacing must be decreased by

44 11 3

TO LEFT RIGHT

22 77 0

5

18

x1ylmiddotmiddotmiddotmiddotmiddotXSy5

4 55 77 2

11

111

999

11

Name J

Fig 5 The internal representation of thePOLYVRT chain-file Reference to poly-gon chains are positive or negative accord-ing to the direction of the chain used

vantages of different ways of encoding sur-faces He comes to the conclusion that theencoding of surfaces by contours minimizesthe storage capacity whereas a regulargrid of surface points minimizes the com-puting time necessary for several types ofmanipulations Boehm did not include inhis study a data system with an irregulardistribution of points but only contour en-codings and regular grid structures of con-stant and variable mesh width He didnot reflect on the reasons why contoureddata minimize storage capacity whereas aregular grid minimizes computing time Ifhe had done so it is quite possible that thedevelopment of geographic data baseswould have taken different routes Sur-

22 X Y

Y

77 Y

CHAINSName Poinls from to Ltlll Right

888 22 55

222 55 77

77

POINTS

Secondary Storagex y detail

POLYGONSUst Point Namemiddotm

al 2 No1 April 1975 61

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 8: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

the reciprocal of this factor and the totalnumber of points is increased by p

The question has been raised many timesin photogral11t11etrywhether contours arethe best representatives of topographicsurfaces It has been stated that contoursdo not detect many types of breaks whichare frequent on terrain (eg Brandstatter1957) Therefore the encoding of sur-faces by vertical profiles has beenattemptedmiddot many times in photogrammetryin recent years Points are encoded onlywhen the slope of the surface changes(eg Kraus 1973) In another case thebreak lines are encoded in addition to aregular grid (Sima 1972) and in anotherapproach only the break lines or only theridges and channel lines of a surface areencoded (Grist 1972) The amount ofdetail with these approaches depends onthe scale used

vVhen performing numerical computa-tions on the basis of these digital terrainmodels the quantity of data involved willbe only one determining factor for theamount of programming and computationsneeded 11ost of the numerical computa-tions on surfaces require some type ofneighborhood function either to computesome surface behavior such as slope orlocal variation of relief or to find the nextunit for the drawing of a contour or avertical profile For a set of contours itis relatively easy to create a directorywhich indicates a sequence of contours ina type of tree in which the surroundingcontour is the base and the other contoursare the branches (Morse 1968) To findthe neighboring points on the two ad-jacent contours for a given point on anintermediate contour however one mustsearch through all the points on the twoadjacent contours and compute the dis-tances to the point in question This pro-cedure is time-consuming if the contourlines are represented by very many pointsA regular grid on the other hand has animplicit neighborhood function and findinga neighbor does not involve search norextra computer time For a set of irregu-larly-distributed points as they are repre-sented in very simple data structures (eg

62

SYMAP) the creation of a neighborhoodfunction is usually performed by findingthe closest small number of points wherethe number varies around six

THE GENERAL CASE

I t has been noted that although planesurfaces and three-dimensional surfaceslook rather different it takes only a fewassumptions to treat one as the subset ofthe other While three-dimensional sur-faces are always based on interval or ratiodata planar surfaces often involve ordinaland nominal data However it has beenshown that ordinal and nominal data maybe treated as interval data (N ordbeck andRystedt 1970 Rosenfeld 1969) and evenwithout this conversion we can combinethe two types into one general case bysimply using different assumptions aboutneighborhood

Given a set of n data points for whichx y and z coordinates are known con-tinuously defined surfaces (ie surfacesfor which one and only one value existsat every point) can be created using anyof three different assumptions about thesurface behavior (Peucker 1972) Thefirst assumption is that of a stepped sur-face which says that the surface retainsthe value of a data point within the neigh-borhood of that data point where neigh-borhood is defined either by a given poly-gon (the choropleth approach) or by thefact that the area is closer to one datapoint than to any other one in the neighbor-hood (the proximal approach) The sec-ond assumption is that each data pointrepresents a sample of a single value on aconstantly changing surface Neighbor-hood is then a number of closest neighbor-ing data points and intervening values areinterpolated with different types of in-terpolation procedures The third assump-tion is that the data point is a sample froma constantly changing surface that maycontain errors thus the data point is notnecessarily located on the surface butclose to it This approach implies thefurther assumption that the actual surfaceis smoother than the surface constructedthrough the sample data points

The American CartograPher

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 9: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

linFeatures

ChainGroup

vVith these assumptions about surfacebehavior any point or areal distributionof a variable can be treated as a continuousfunction z = f (x y) and planar and three-dimensional surfaces can be combined intoone type This has already been done insome computer progranls the most notablebeing SYMAP Many cartographic ap-plications must treat both types of sur-faces and therefore data structures mustbe developed which can handle both typesat one time

THE PROPOSED DATA STRUCTURES

Planar Surfaces

The need to incorporate different typesor hierarchies of polygons uncovered oneof the limiting assumptions made inPOLYVRT A chain plays a dual rolefirst it is the boundary of two areal enti-ties and secondly it is the unbroken unitof point retrieval This is not a problemif one is limited to a single nonoverlappingset of polygons The following proposednew system to bear the name GEOGRAFis an attempt at greater flexibility

Because of the addition of many layersof complexity involving multiple polygonsets the chain cannot remain to the samedegree the controlling object of the datastructure Just as the notion of an un-broken line is important so is the notionof an unpartitioned space In a systemwhich must handle overlapping polygonnetworks there is a need for a root ob-ject which is defined as an area uncut byany further partitioning This object istermed the Least Common GeographicUnit (LCGU) The LCGUs are con-structed as a POLYVRT polygon directlyfrom chains The relationship of theLCGDs to all other polygon types is hi-erarchical (Fig 6)

In turn the existence of the LCGUallows for the creation of each class ofpolygon In order to allow simple codingof the boundary relationships at thesehigher levels in the structure the chaingroup was devised A chain group is aset of chains which form a boundary of twoareal units for a given polygon classThese polygons are constructed of chain

0 2 No1 APril 1975

Cha DGroup === LiI I

EJ 8ha0ENodes -- -- LCGU 0

- - -LJ ~ 1

RFig 6 The relationship between the vari-ous parts of the data structure

groups which in turn are constructedfrom chains

Line features can be built up of chainsin the same manner as chain groupsNote that the chain group listing for eachlevel of polygon and each listing for a linefeature reference only the chains them-selves This allows each system to beconsidered as a separate directory whichis core-resident only when that class ofobjects is retrieved

The LCGU has other implications andapplications that are useful because of thetopological data structure The LCGUwith its coding for each of the polygonsets can be combined with contiguity in-formation of linear feature types to producean Attribute Cross Reference (A CR) The ACR is a table in which all objects(in polygon and linear systems) are cross-referenced to each other to determinenesting By using attributes of chains(lengths) and LCGUs (areas populationdensities etc) this cross-referencing cap-ability can assign a string of data collectedfor one polygon type to a string of asecond type

Topological manipulation routines arecentral to the success of this structureThe intersection of geographic featureswill rely on topological knowledge to re-alize economies of scale in processing largefiles All operations with lines will actu-ally work with bands built with endpoints

63

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 10: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of the line and the furthest deviants toboth sides (if the bands become too widethe lines are split etc) With thisapproach the windowing process uses non-linear windows (which is often the casewith geographic coordinates and map pro-jections) and becomes quite elegant Sim-ilarly intersection procedures such aspoint-in-polygon line-across-polygon andpolygon-over-polygon determination allowfor gainful application of the topologicalprinciple For point-in-polygon searcheschain groups are constructed which bisectthe universe into parts to sort the points(or nodes of polygons) into three groupsleft within the band and right Only thesecond group needs more detailed treat-ment The point set is then recursivelypartitioned until it reaches the level of theLeGDs For line-oriented problems thetopological connections of the two sets com-pared would allow intersections to belimited to immediate neighbors

Another important procedure will be anested chain-intersection routine Hereagain the chain band and its recursivesegmentation is used The number ofpoints which define a chain is constantlyincreased until the intersection test can bedetermined The search of a line througha set of polygons will use a graph-searchalgorithm developed for the GDS (Geo-graphic Data Structure) project discussedin the next section The neighborhoodsearch routines create records of neighborsfor every point or line or polygon at anyspecified depth of neighborhood

Three-Dimensional Surfaces

To implement the ideas presented herethe senior author is developing a geo-graphic information system for three-dimensional surfaces Both systems thethree-dimensional and the planar arebased on data structures with explicittopological neighborhood relationships

The basic philosophy of both approachesis to separate the data base from the appli-cation programs In the early days ofcomputer cartography a data set wasgenerally tied to the application programthat used it Now the available data have

64

become so voluminous and the applicationprograms so varied that extra efforts inthe preparation of data for more efficientcomputations seem to be justified We arewell in step with modern computer scienceto separate the data base from the applica-tion programs with the data structure be-coming the link between the two

The creation of a structured data baseis nothing new The interpolation of anirregular grid to a regular grid of height-points provides a data structure throughthe implicit neighborhood function of thegrid And in the case where polygons areindependently defined by a series of pointsthe neighborhood relationship is replacedby a search algorithm which finds theneighbors for every polygon by searchingfor matching segments in the boundaryfiles of the other polygons The idea is tospend computing time before any applica-tion has been performed in the anticipa-tion of heavy uses of the data base Anefficient structure of the data base shouldnot only speed up computations consider-ably but should also simplify the produc-tion of application programs

The data structure developed under theworking title Geographic Data Structure(GDS) is based on irregularly distributedpoints which are assumed to be samplepoints without sampling errors from asingle-valued surface Two types of struc-tures form the core of the data bases Thefirst creates neighborhood relationships bytriangulating the data set and storing forevery point the labels of all points whichare linked with the point by a triangleedge (Fig 7) The second structure isproduced by selecting those points of thesurfaces which lie along lines of high in-formation content such as ridges andchannel lines and defining them by theirnodes which are peaks passes and pits(Fig 8) This second data structureserves two purposes First ~t is a generalrepresentation of the surface for roughcomputations second it is a directoryinto the more detailed first structure

In the first structure the creation of theneighborhood relationship is based on theassumption that data-sets are usually of

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 11: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

34 16 3S 53 52 33 15

11 4 18 36 35 16 38

CHAINS

POINTE~

NODES

Fig 8 The GDS-second data structureBoth the node-file and the chain-file haveaccess to the first data structure the node-file directly and the chain-file through achain-painter-file

Gottschalk 1970) creates all possible linkschooses the shortest and eliminates alllinks which intersect the shortest Thisprocedure is repeated with the next short-est links until no links intersect The re-sult is the set of links with the minimumcumulative distance between neighboringpoints

The procedure has one disadvantagesince (~ ) links have to be created the num-ber of points is therefore limited toonly several hundred The first step inour approach therefore limits the links toa number of potential neighbors amongwhich the shortest link is chosen and in-tersected with all other links originatingin these potential neighbors This pro-cedure limits the number of tests for in-

nm2

tersections of links to less than -4-where m is the number of potential neigh-bors an arbitrary number between 8 and14 depending on the density variation ofthese points The procedure does notguarantee however that only triangles areconstructed polygons with more thanthree sides can result although they arerelatively rare The check for such poly-gons and their elimination is very easy andfast

The second possibility is to create atriangulated structure through use ofThiessen polygons A published solution(Rhynsburger 1973) intersects for everypoint the links to every other point midwayand chooses the smallest polygon created

~COIDJ-OIIIJI II POINTS POINTERS CHAIN POINTERSL Nm bull

[[]]-DI ~~u =

317 35 34 1516

Points Pcinters

POINTS POINTERSName x y bull Pointers

r--1735

~ i34

6 X Y z -- 15 -~ 17 X Y z i+6 23

18 X Y z i+13

~

-bull183635- 16

3 X Y z j 3

~

~l

----

NEIGHBORHOOD RELATIONSHIPS

TRIANGULATED POINTS

Fig 7 The GDS-first data structure Theillustration shows the points on the surfacewith their links to neighbors (edges) Theexternal representation is shown by theneighborhood relationships The internalrepresentation is composed of the point-fileand the pointer-file

two types (a) sets of irregularly-distri-buted points which were digitized with theunderstanding that every point is signif-icant and (b) sets of regularly or irregu-larly-distributed points where it is knownthat a number of points are redundant andcan be eliminated from the set eg regulargrids of points and encoded contours

The first type of data set is linked bysome type of triangulation At least twoapproaches exist The first (Dueppe and

Vol 2 No1 April 1975 65

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 12: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

by the perpendiculars Every point whichcontributes to the Thiessen polygon is aThiessen neighbor This procedure canagain be simplified by the assumption of alimited set of potential neighbors Thesame checking routines as above have tobe applied

Another approach which limits the num-ber of necessary tests but is mathemat-ically correct at the same time has beendeveloped within the project by KurtBrassel at Harvard University The pro-cedure is based on fields of potentialneighbors which converge very rapidly

The alternative to triangulation is three-dimensional generalization ie to selectfrom a set of points those which define thestructure with the least deviation from theoriginal surface The basic concept de-veloped by Randolph Franklin of HarvardUniversity and T K Peucker is to ap-proximate the surface of a series of tri-angles through a selected set of pointswhere each additional point included intothe set is the one which deviates the mostfrom the approximated triangles until thedeviations are below a given value (seePeucker 1974)

In both cases the triangulation and gen-eralization of the surface the result is alinked list of surface points The termlinked list means that points are linked withone another through pointers In otherwords a point is not only identified by its y z coordinates but also by a list of thelabels of the points which form edges oftriangles with the point In our case eachrecord consists of the t y z coordinates ofa point and a reference to the start of thepointers to the neighbors in a pointer listThe reason for not having the neighbor-hood pointers with the point record is thatthe number of points varies considerably(Mark 1974) Since the record has to belong enough to include all possible num-bers of neighbors large parts of the pointersections would be empty most of the timeThe pointers are sorted starting with thepointer the least East of North of a point(Fig 7)

The use of this type of data-structure isvery simple and efficient For every search

66

(profile contour etc) a criterion for edge-intersection is developed For contouringfor example the critical question iswhether one point of the edge is above thecontour level and the other below A startis found and one point of the edge is con-sidered a reference point and the other asubpoint The next subpoint is found bylooking up the next neighbor in the pointerlist If the test is positive the intersectionis performed and the process repeated Ifthe test is negative the reference and sub-points are switched and the process re-peated

Other procedures are equally simpleTo find a triangle for example one hasonly to have a reference and a subpointThe third point is the next label in thepointer list of the reference point after thesubpoint To find all triangles one goesthrough the total pointer list leaving outall those edges connecting reference pointswith subpoints with a smaller label sincethey would create triangles which hadbeen treated when the subpoint was areference point

Although the first data structure aspresented above seems to be efficient interms of storage capacity it does not pro-vide easy access to the data base which isoften very large I t is for this reasonthat we are developing the second datastructure to represent the general structureof the surface and to serve as a directoryto reach into the first data base (Pfaltz1975) (Fig 8)

The first step in the creation of thesecond data structure is to find the ridgechannel and break lines on the surface Ifone labels the highest point for every tri-angle the unlabeled points are membersof the channel line at least on smoothsurfaces A subsequent search routinedeals with the irregularities Points alongridges are found by eliminating the lowestpoint of every triangle using a routinedeveloped for a regular grid by D MDouglas The detection of break lines issomewhat more difficult

Some theoretical studies of surfaces byWarntz (1966) show that ridge lines andchannel lines cross at passes a useful point

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 13: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of information relative to the developmentof the second data structure Practicalconsiderations suggest that in terrain andother surfaces this regularity is not alwayspresent

Once the topological structure is at handit can be used as a directory into the firststructure The line is therefore treatedas a chain similar to the chain of thePOLYVRT system The nodes of thechain are the peaks passes pits and otherendpoints of chains on the surfaces Thesepoints are stored with their coordinatesand the names of the chains which termi-nate at the nodes The chains are storedwith the labels of the nodes and pointersinto the chain lists which consist of labelsof points in the first data structure (notethat here the chain structure differs fromthat of POLYVRT)

A third component of the GeographicData Structure should be mentioned sinceit illustrates very well the logical adapta-tion of a computer problem solution togeographical data The problem at handis the partitioning of the data set Sincewith large data sets only portions can bekept in fast memory the data base is seg-mented into pages which are broughtinto memory as units For the Geo-graphic Data Structure the paging sys-tem can solve several problems inherentin a complex geographic information sys-tem

The boundaries of patches as we callthe areal extension of a page are chainsalready defined for the second structureSince detail along the chain is of no topo-logical interest the density of points alongthe chain can differ for its two sides Inother words the density of triangles canchange from patch to patch This allowsfor very efficient data encoding even interrain with sudden changes in the sur-face behavior as at the change from amountainous area into a plain (Peucker1972)

Another advantage of the paging-systemis the ease of including topographic andplanar information Linking point lineand areal data to the triangulated pointswould lead to high definitional redundancy

Vol 2 No1 APril 1975

The secondary structure could lead lttoam-biguities where the terrain is very elon-gated Since an attempt has been made tokeep the shape of the patches as compactas possible the combination of non-terraindata with patch boundaries seems to bemost appropriate

Since the patch boundaries are againchains another virtue comes to light Thepatches can be treated as polygons of thePOLYVRT system with little difficultyThis link between the two systems lendshope that eventually they may be merged

It is an appropriate question to ask whatsuch a data structure as the GDS will beable to accomplish A number of displayroutines have already been developed(Cochrane 1974) and a series of proce-dures for surface analysis based on heuris-tic searches are underway (Fig 9) Sinceboth levels of data structure are graphswe will be able to rely on many of the de-velopments connected with operations re-search specifically network analysis forthe manipulative treatment of the data

As both systems GDS and GEOGRAFhave topological structures it is possibleto merge the two The creation of poly-gons from points is the major link fromthe GDS project to GEOGRAF Thecreation of a set of centroids for polygonsallows the conversion in the opposite direc-tion This way surfaces can be treated aspolygonal sets and can be displayed andmanipulated by the routines of GEO-GRAF Conversely polygonal data canbe treated as surfaces for GDS Theneighborhood routines are what make theproject useful in quantitative geographyand planning Neighborhood searches areextremely expensive without the topologi-cal data structure but they are usually amost important part of urban and environ-mental analyses once a general overview isobtained from the data

Although basic research and applicationdevelopment are two sides of one coin andmust go together Ito obtain lasting resultsthis paper has concentrated on the theo-retical parts of the project since their de-velopment is ahead of the application rou-tines a fact which should be expected

67

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 14: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

REGULAR GRID

~GENERALIZATION

TRIANGULATEDIRREGULAR GRID

First Data Structure

jNETWORK PROCEDURES

THEORy OFSURFACE SPECIFICPOINTS AND LINES

HIURfSTIC SEARCH

SEARCH

Profile SearchGeographic DisorientationOptimum SearchNeighborhood An~lysis

ContouringBlock DiagramShaded ContoursInclined ContoursVisibility TestRadar Image

Fig 9 The problem flow for GDS from the data base via the extraction of points andchains for the two types of data structures to their application

The quintessence of the research so far isthe hypothesis that topologically-structureddata bases of three-dimensional and planarsurfaces can result in reduced efforts inthe development and execution of applka-

REFERENCESAmidon E L and G S Aiken (1970) Al-

gorithmic Selection of the Best Method forCompressing Map Data Strings Communica-tions of the Association for Computillg Ma-chinery Vol 14 pp 769-774

68

tion routines We have some indicationthat the hypothesis is correct the real testwill come when the bulk of the applicationroutines is completed

Arms A (1970) MAPMODEL System Sys-tem Descriptioll and Users Guide Bureau ofGovernmental Research and Service Univer-sity of Oregon 59 pp

Boehm B M (1967) Tabular Representation

The American Cartographer

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69

Page 15: +>KMIBK>JCD?,>M> :MKN?MNKALkclarke/Geography232/PeuckerChrisman.pdfDelivered by Publishing Technology to: University of California Santa Barbara Library IP: 128.111.234.232 on: Tue,

Del

iver

ed b

y P

ublis

hing

Tec

hnol

ogy

to U

nive

rsity

of C

alifo

rnia

San

ta B

arba

ra L

ibra

ry IP

128

111

234

232

on

Tue

14

May

201

3 23

14

44C

opyr

ight

(c)

Car

togr

aphy

and

Geo

grap

hic

Info

rmat

ion

Soc

iety

All

right

s re

serv

ed

of Multivariate Functions with Applicationsto Topographic Modelling Proceedings Z2ndNational Conference Association for Comput-ing Machinery pp 403-415

Brandstaetter L (1957) Exakte Schichtlinienund Topographische GelaendedarstellungOesterreichische Zeitschrift fuer VermesSlmgs-wesepound Sonderheft 18 Wien 90 pp

Cochrane Douglas (1974) Cartographic Dis-play Operations on Surface Data Collected inan Irregular Structure MA thesis SimonFraser University

Cooke D F and W F Maxfield (1967) TheDevelopment of a Geographic Base File andIts Uses for Mapping Urban and RegionalInformation Systems for Social Programs Pa-pers from the 5th Annual Conference of theUrban and Regional Information System As-sociation pp 207-218

Douglas David H (1973) BNDRYNETPeucker T K (ed) The Interactive Map inUrban Research Final Report After Year OneUniversity of British Columbia

Dueppe R D and H J Gottschalk (1970)Automatische Interpolation von Isolinien beiwillkuerlich verteilten Stuetzpunkten Allge-meine Vermessungs-Nachrichten Vol 129 No10 (Oct) pp 423-426

Grist M W (1972) Digital Ground ModelsAn Account of Recent Research Photogram-metric Record Vol 70 No4 (Oct) pp 424-441

Heiskanen W A and H Moritz (1967) Phys-ical Geodesy W H Freeman San FranciscoChapter 7

Holroyd M T and B K Bhattacharyya(1970) Automatic CO1poundtouring of GeoPhysicalData Using Riclbic Spline Interpolation Geo-logical Survey of Canada Paper No 70-55

Hsu M L et al (1975) Computer Applica-tions in Land Use Mapping and the MinnesotaLand Management Information System Da-vis ] C and M McCullagh (eds) Displayand Analysis of Spatial Data New York JohnWiley and Sons

Junkins ] L G M Miller and J R Jancaitis(1973) A Weighting Function Approach toModelling of Irregular Surfaces Journal ofGeophysical Research Vol 78 No 11 (Apr)pp 1794-1803

Kraus K (1973) A General Digital TerrainModel translation from an article in Acker-mann F (1973) Numerische Photogram-metrie Sammlung Wiechmann Neue FolgeBuchreihe Bd 5 Karlsruhe

Laboratory for Computer Graphics and Spatial

Vol 2 No1 April 1975

Analysis Harvard University (1974) POLY-VRT Manual Cambridge Mass

Mark David M (1974) A Comparison ofComputer-based Terrain Storage MethodsWith Respect to Evaluation of Certain Geo-morphometric Measures MA thesis Uni-versity of British Columbia

Morse S P (1968) A Mathematical Modelfor the Analysis of Contouring Line DataJournal of the Association for Computing Ma-chinery Vol 15 No2 pp 205-220

Nake F and T K Peucker (1972) The Inter-active Map in Urban Research Report afterYear One University of British Columbia

Nordbeck S and B Rystedt (1970) IsarithmicMaps and the Continuity of Reference IntervalFunctions Geografiska Annaler Vol 52 SerB pp 92-123

Peucker T K (1972) Computer CartographyAssociation of American Geographers CollegeGeography Commission Resource Paper No17 Washington DC

Peucker T K (ed) (1973) The InteractiveMap in Urban Research Final Report AfterYear One University of British Columbia

Peucker T K (1974) GeograPhical DataStructltres Report After Year One SimonFraser University

Pfaltz J L (1975) Surface Networks Geo-graPhical Analysis (forthcoming)

Rhynsburger D (1973) Analytic Delineationof Thiessen Polygons Geographical AnalysisVol 5 No2 (Apr) pp 133-144

Rosenfeld A (1969) Picture Processing byComputer New York Academic Press

Schmidt W (1969) The Automap SystemSurveying and Mapping Vol 29 No 1(Mar) pp 101-106

Shepard D (1968) A Two-Dimensional Inter-polation Function for Irregularly SpacedData Proceedings 23rd National ConferenceAssociation for Computing Machinery pp 517-524

Sima J (1972) Prinzipien des CS digitalenGelaendemodells Vermessungstechnik Vol20 No2 pp 48-51

Switzer P (1975) in Sampling of PlanarSurfaces Davis J C and M McCullagh(eds) DisPlay and Analysis of Spatial DataJohn Wiley and Sons New York

Tobler W R (1969) Geographical Filters andtheir Inverses GeograPhical Analysis Vol 1pp 234-253

Warntz W (1966) The Topology of Socio-Economic Terrain and Spatial Flows Papersof the Regional Science Association Vol 17pp47-61 bull

69