OpenGeo: Introduction to PostGIS



PostGIS

PostGIS is an extension to the PostgreSQL relational database that provides spatial types,

indexes and functions, following the OGC “Simple Features for SQL” (SFSQL).

Starting the Suite

You can start and stop the OpenGeo Suite, and access components like PostGIS and

GeoServer, via the “Dashboard”.

Start the Dashboard from the Start Menu > OpenGeo (Windows) or Applications >

OpenGeo (OS/X).

When you first start the dashboard, it provides a reminder about the default password for

accessing GeoServer.

Note

The PostGIS database has been installed with unrestricted access for local users (users

connecting from the same machine as the database is running). That means that it will

accept any password you provide. If you need to connect from a remote computer, the

password for the postgres user has been set to postgres.

1. First, we need to start up the Suite (which will start both PostGIS and GeoServer). Click the green Start button at the top right corner of the Dashboard.

2. The first time the Suite starts, it initializes a data area and sets up template databases. This can take a couple of minutes. Once the Suite has started, you can click the Manage option under the PostGIS component to start the pgAdmin utility.


About OpenGeo

OpenGeo provides commercial open

source software for internet mapping and

geospatial application development. We

are a social enterprise dedicated to the

growth and support of open source

software.

License

This work is licensed under a Creative

Commons Attribution-Share Alike 3.0

United States License. Feel free to use this

material, but we ask that you please retain

the OpenGeo branding, logos and style.


OpenGeo : Introduction to an Open Source Geostack : PostGIS http://workshops.opengeo.org/stack-intro/postgis.html

1 of 11 07/02/2011 10:35


Note

PostgreSQL has a number of administrative front-ends. The primary one is psql, a command-line tool for entering SQL queries. Another popular PostgreSQL front-end is the free and open source graphical tool pgAdmin. All queries done in pgAdmin can also be done on the command line with psql.

3. If this is the first time you have run pgAdmin, you should have a server entry for PostGIS (localhost:54321) already configured in pgAdmin. Double-click the entry, and enter anything you like at the password prompt to connect to the database.

Note

If you have a previous installation of pgAdmin on your computer, you will not have an entry for (localhost:54321). You will need to create a new connection. Go to File > Add Server, and register a new server at localhost and port 54321 (note the non-standard port number) in order to connect to the PostGIS bundled with the OpenGeo Suite.

Creating a Database

PostgreSQL has the notion of a template database that can be used to initialize a new

database – the new database automatically gets a copy of everything from the template. When

you installed PostGIS, a spatially enabled database called template_postgis was created.

If we use template_postgis as a template when creating our new database, the new

database will be spatially enabled.

1. Open the Databases tree item and have a look at the available databases. The postgres database is the user database for the default postgres user and is not too interesting to us. The template_postgis database is what we are going to use to create spatial databases.

2. Right-click on the Databases item and select New Database.


Note

If you receive an error indicating that the source database (template_postgis) is

being accessed by other users, this is likely because you still have it selected.

Right-click on the PostGIS (localhost:54321) item and select Disconnect.

You can then double-click the same item to reconnect and try again.

3. Fill in the New Database form as shown below and click OK.

Name      postgis
Owner     postgres
Encoding  UTF8
Template  template_postgis

4. Select the new postgis database and open it up to display the tree of objects. You’ll see the public schema, and under that a couple of PostGIS-specific metadata tables, geometry_columns and spatial_ref_sys, which we will discuss later.


5. Click on the SQL query button indicated below (or go to Tools > Query Tool).

6. Enter the following query into the query text field:

SELECT postgis_full_version();

Note

This is our first SQL query. postgis_full_version() is a management function that returns version and build configuration.

7. Click the Play button in the toolbar (or press F5) to “Execute the query”. The query will return the following string, confirming that PostGIS is properly enabled in the database.

8. You have successfully created a PostGIS spatial database! Now do a spatial calculation just to make sure. Copy the following into the SQL window:

SELECT ST_Length('LINESTRING(0 0, 1 1)');

Our first spatial query constructs a diagonal line across a one-unit square. The length of that line is sqrt(2), or about 1.4142.
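The sqrt(2) result of the ST_Length() check can be reproduced by hand. The following is a minimal Python sketch, not PostGIS code: the function name linestring_length is made up for illustration, and it assumes planar coordinates, where the length of a line is the sum of the straight-line lengths of its segments.

```python
import math

def linestring_length(coords):
    """Sum the planar length of each consecutive pair of vertices."""
    return sum(
        math.hypot(bx - ax, by - ay)
        for (ax, ay), (bx, by) in zip(coords, coords[1:])
    )

# The same line as 'LINESTRING(0 0, 1 1)' in the query above.
diagonal = [(0, 0), (1, 1)]
length = linestring_length(diagonal)
print(round(length, 4))  # 1.4142
```

The units of the result are simply the units of the coordinates, which is why later queries on the EPSG:2270 data return feet.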

Loading Shapes into PostGIS

The workshop data files are public domain data from the City of Medford, Oregon. The files are

located in the data/ directory of the workshop. The projection of the data is NAD83 State Plane

(Oregon South) in feet, more succinctly and opaquely known as EPSG:2270. The files are:

school_pt.shp: a small point file of school locations
road_ln.shp: a large line file of street centerlines
taxlot_ply.shp: a large polygon file of taxable property parcels

We will load our example data into PostGIS using the pgShapeLoader tool, which converts shape files into PostGIS tables.

1. From the pgAdmin Plugins menu, select PostGIS Shapefile and DBF loader.


2. The loader will start with the connection information for your current pgAdmin database. Click the “Test connection...” button to ensure you can connect to the database. Now, click on the button in the “Shape File” area, and browse to the data directory. Select the “school_pt.shp” file, and click “Open”.

3. Next, change the value of the SRID field to 2270.

4. Finally, click the “Import” button to start the process.

5. Repeat the process for “road_ln.shp” and “taxlot_ply.shp”. These are much larger files. To make the load process go faster, open the “Options...” dialogue and turn on the “Load using COPY rather than INSERT” option before running the import.

Loading Shapes into PostGIS... Using the Command Line


PostGIS ships with a command-line utility for loading shape files into the database, called shp2pgsql, as well as a utility for exporting tables to shape files, called pgsql2shp.

If you completed the process with PostGIS Shapefile and DBF loader above, you do not need to run these commands, because the data is already loaded into your database.

Enter the workshop data directory, set the PATH environment variable to include the PostgreSQL executables directory, and then run the data loading commands. shp2pgsql converts the shape file into a SQL text file suitable for loading into the database. psql loads the text file into the target database.

# set PATH=%PATH%;C:\Program Files\OpenGeo\OpenGeo Suite\pgsql\8.4\bin
# shp2pgsql -I -s 2270 -D road_ln.shp road_ln > road_ln.sql
# psql -p 54321 -U postgres -f road_ln.sql -d postgis
# shp2pgsql -I -s 2270 -D taxlot_ply.shp taxlot_ply > taxlot_ply.sql
# psql -p 54321 -U postgres -f taxlot_ply.sql -d postgis
# shp2pgsql -I -s 2270 -D school_pt.shp school_pt > school_pt.sql
# psql -p 54321 -U postgres -f school_pt.sql -d postgis
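Since the same two commands are repeated for each shape file, they can be generated in a loop. The Python sketch below only builds and prints the command lines; it does not run them. It rests on a few assumptions: shp2pgsql only writes SQL and never connects to the database (its own -p flag means "prepare mode"), so the Suite's port 54321 is passed to psql instead; connecting as the postgres user with -U is also an assumption; and the helper name build_load_commands is hypothetical.

```python
SRID = 2270           # NAD83 State Plane Oregon South, feet (from the text)
PORT = "54321"        # the Suite's non-standard PostgreSQL port
DATABASE = "postgis"  # the database created earlier
USER = "postgres"     # assumption: connect as the default postgres user

def build_load_commands(layer):
    """Return (shp2pgsql argv, psql argv) for one shape file.

    -I creates a spatial index, -s sets the SRID, -D uses dump (COPY) format.
    """
    convert = ["shp2pgsql", "-I", "-s", str(SRID), "-D", f"{layer}.shp", layer]
    load = ["psql", "-p", PORT, "-U", USER, "-d", DATABASE, "-f", f"{layer}.sql"]
    return convert, load

for layer in ("road_ln", "taxlot_ply", "school_pt"):
    convert, load = build_load_commands(layer)
    # shp2pgsql writes SQL to standard output, so it is redirected to a file.
    print(" ".join(convert), ">", f"{layer}.sql")
    print(" ".join(load))
```

To actually run the commands, each argument list could be handed to subprocess.run, with the shp2pgsql output written to the .sql file.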

PostGIS System Tables

PostGIS follows the OGC SFSQL (Simple Features for SQL) specification, which means it

includes two standard system tables of metadata: SPATIAL_REF_SYS and

GEOMETRY_COLUMNS.

SPATIAL_REF_SYS

The SPATIAL_REF_SYS table contains information about “spatial reference systems”: combinations of geographic systems (ellipsoids, datums) and projected systems (projections, parameters) that are used for real-world mapping. “Transverse Mercator” is an example of a projection, and WGS84 is an example of a spheroid, but “UTM Zone 10 North, NAD 83” is an example of a full spatial reference system.

        Table "public.spatial_ref_sys"
  Column   |          Type           | Modifiers
-----------+-------------------------+-----------
 srid      | integer                 | not null
 auth_name | character varying(256)  |
 auth_srid | integer                 |
 srtext    | character varying(2048) |
 proj4text | character varying(2048) |
Indexes:
    "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)

Each row in the SPATIAL_REF_SYS table corresponds to one spatial reference system. The

srid column is the unique identifier, and is considered “internal” to the database. The

auth_name and auth_srid are the external authority and authority number. The authority is

usually “EPSG” and the table that ships with PostGIS matches the srid to the auth_srid for

convenience.

The srtext is the OGC “well-known text” representation of the spatial reference system. The

proj4text is the representation consumed by the Proj.4 reprojection library PostGIS uses to

provide on-the-fly reprojection. Because only the proj4text is used internally by PostGIS, it

is usually safe to omit the srtext when adding new entries, but be aware that external

programs may use the srtext to determine the projection of a particular table.

GEOMETRY_COLUMNS

The GEOMETRY_COLUMNS table contains information about the spatial columns in a database.

           Table "public.geometry_columns"
      Column       |          Type          | Modifiers
-------------------+------------------------+-----------
 f_table_catalog   | character varying(256) | not null
 f_table_schema    | character varying(256) | not null
 f_table_name      | character varying(256) | not null
 f_geometry_column | character varying(256) | not null
 coord_dimension   | integer                | not null
 srid              | integer                | not null
 type              | character varying(30)  | not null

Each row in the table corresponds to one spatial column. Tables may have multiple spatial

columns. Client software such as QGIS and uDig often use the GEOMETRY_COLUMNS table to

figure out which columns to display to the end user as “layers” suitable for viewing on a map.

The first four columns (f_table_catalog, f_table_schema, f_table_name,

f_geometry_column) serve to uniquely locate the geometry column. The next three

describe the spatial metadata:

coord_dimension provides the dimensionality (2, 3, or 4 dimensions are supported in PostGIS);
srid provides the spatial reference system and must refer to a valid row in the SPATIAL_REF_SYS table;
type provides the geometry type (point, linestring, polygon, etc.).

Note that the GEOMETRY_COLUMNS table is not automatically updated as you create and drop

tables. You must manually keep it up to date.

One way to keep the table up-to-date is to religiously use the AddGeometryColumn()

function when managing DDL in spatial tables. This function takes in all the information

necessary to create a new column, performs the creation, and adds a metadata record:

SELECT AddGeometryColumn(

'public',

'mytable',

'mygeocolumn',

2,

4326,

'POLYGON'

);

Another way to keep the table up-to-date is to use helper functions. PostGIS 1.4 and higher

provide the Populate_Geometry_Columns() function, which checks for validity and also

fills in missing entries.

-- PostGIS 1.4

SELECT Populate_Geometry_Columns();

populate_geometry_columns

-------------------------------------------

probed:3 inserted:3 conflicts:0 deleted:0

(1 row)

Spatial Queries

We will now construct some queries of our spatial database, using “spatial SQL” functions

provided by PostGIS (and any other SFSQL spatial database). For a reference list of functions

we will be using, see the PostGIS Functions section.

Measuring

The taxlot_ply table contains 91,343 parcel polygons. It also includes a large number of

attributes about each parcel, including:

impvalue (improvement value)

landvalue (land value)

acreage (reported acreage)

yearblt (year built)

feeowner (name of the owner)

state (state of residence of the owner)

We can use the ST_Area() function in combination with these attributes to ask some questions of the taxlot_ply table. Open the pgAdmin SQL window and enter the following queries into the database.

What is the area in acres of all parcels in the database?

SELECT Sum(ST_Area(the_geom)) / 43560

FROM taxlot_ply;

Answer: 1772888

What is the area in acres of parcels built on since 2000?


SELECT Sum(ST_Area(the_geom)) / 43560

FROM taxlot_ply

WHERE yearblt >= 2000;

Answer: 27176

What is the value per square foot of all parcels?

SELECT Sum(landvalue + impvalue) / Sum(ST_Area(the_geom))

FROM taxlot_ply;

Answer: 0.41

What is the value per square foot of all parcels held by out-of-state owners?

SELECT Sum(landvalue + impvalue) / Sum(ST_Area(the_geom))

FROM taxlot_ply

WHERE state != 'OR';

Answer: 0.38
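The division by 43560 in the area queries converts square feet to acres, because the data is in a foot-based projection and one acre is 43,560 square feet. The Python sketch below illustrates that arithmetic on a single made-up parcel. It uses the shoelace formula for the area of a simple polygon, which is an assumption chosen for illustration and not a description of how ST_Area() is implemented.

```python
SQ_FT_PER_ACRE = 43560

def polygon_area(ring):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical parcel: 660 ft by 66 ft (one furlong by one chain).
parcel = [(0, 0), (660, 0), (660, 66), (0, 66)]
area_sq_ft = polygon_area(parcel)
area_acres = area_sq_ft / SQ_FT_PER_ACRE
print(area_sq_ft, area_acres)  # 43560.0 1.0
```

The value-per-square-foot queries skip the conversion and divide the summed dollar values directly by the summed area in square feet.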

Measurement is not limited to areas. We can also use linear measurements to characterize the

roads in the county.

What is the breakdown of road types in the county?

SELECT

Sum(ST_Length(the_geom)) / 5280 as miles,

Count(*) as nsegments,

cfcc

FROM road_ln

GROUP BY cfcc

ORDER BY cfcc;

Sub-setting

So far, our queries have calculated one metric or a summary against every record in the

database. Databases are commonly used to store very large tables – larger than can be stored

in memory – and efficiently access sub-sets of those tables.

First, let’s find out the coordinates of the first school in our school_pt table:

SELECT ST_AsText(the_geom) FROM school_pt WHERE gid = 1;

Answer: POINT(4387009 402407)

Now, let’s take that point, and find the average property value in a one-mile (5280 foot) radius.

SELECT Sum(landvalue + impvalue) / Count(*) as avg_value

FROM taxlot_ply

WHERE

ST_DWithin(

the_geom,

ST_GeomFromText('POINT(4387009 402407)', 2270),

5280

);

Answer: 161,094


There are a number of things going on in this query:

The ST_GeomFromText() function is used to build a geometry object from the text

representation of a point. Note that the SRID is also set to 2270 at the same time, to

match the SRID of our data tables.

The ST_DWithin() function is then used to test every geometry against the query

point, and return true only if the geometry was within 5280 units (feet).

Finally, only those records that passed the distance test were fed into the calculation of

the average property value: total value divided by number of properties.
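The three steps above can be mimicked outside the database. This Python sketch is a simplification with made-up parcel data: each parcel is reduced to a single point, whereas ST_DWithin() measures the distance to the nearest part of the parcel polygon. The helper name within is hypothetical.

```python
import math

school = (4387009, 402407)  # the school point from the query, in feet
RADIUS_FT = 5280            # one mile

# Hypothetical parcels: (x, y, landvalue, impvalue)
parcels = [
    (4387009 + 1000, 402407, 50000, 100000),  # 1000 ft east of the school
    (4387009, 402407 + 5280, 40000, 60000),   # exactly one mile north
    (4387009 + 6000, 402407, 90000, 300000),  # 6000 ft east: too far
]

def within(a, b, distance):
    """True if points a and b are no more than `distance` apart."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= distance

# Keep only the parcels that pass the distance test...
nearby = [p for p in parcels if within((p[0], p[1]), school, RADIUS_FT)]
# ...then divide total value by the number of parcels kept.
avg_value = sum(p[2] + p[3] for p in nearby) / len(nearby)
print(len(nearby), avg_value)  # 2 125000.0
```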

Spatial Indexes

The PostGIS spatial index is an r-tree index, implemented on top of PostgreSQL’s GiST access

method infrastructure.

An “r-tree” (and any other spatial index) works by sorting the bounding boxes of features into a

quickly searchable tree. Because the features themselves are not indexed, just the bounding

boxes, all queries that use spatial indexes must proceed in two phases. First, the spatial index

is used to generate a subset of records that might match a spatial condition; then, an exact test

is used on just that subset to produce the final output set.

The “r-tree” index uses nested rectangles (in the two-dimensional case; cubes and hypercubes are used for higher dimensions) to sort the features into a quickly searchable tree.
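The two-phase process described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how PostGIS is implemented: the features are plain points with made-up names, a linear scan stands in for the tree search, and the function names are hypothetical. What it shows is that the cheap box test can let through a feature that the exact test then rejects.

```python
import math

def in_box(point, box):
    """Cheap test: is the point inside the (min_x, min_y, max_x, max_y) box?"""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y

def two_phase_query(features, center, radius):
    cx, cy = center
    # Phase 1: bounding-box filter (what the index answers quickly).
    search_box = (cx - radius, cy - radius, cx + radius, cy + radius)
    candidates = [name for name, pt in features.items() if in_box(pt, search_box)]
    # Phase 2: exact distance test, run only on the candidates.
    matches = [
        name for name in candidates
        if math.hypot(features[name][0] - cx, features[name][1] - cy) <= radius
    ]
    return candidates, matches

features = {"a": (3, 4), "b": (9, 9), "c": (20, 0)}
candidates, matches = two_phase_query(features, (0, 0), 10)
print(candidates)  # ['a', 'b']
print(matches)     # ['a']
```

Feature "b" sits in a corner of the search box but more than 10 units from the center, so it passes the first phase and fails the second.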

To create a spatial index in PostGIS, use the CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometry] ) command. The commands for our three example tables appear in step 3 of the comparison below.

Let’s compare an unindexed and indexed query for speed.

1. First, drop the spatial indexes on your tables.

DROP INDEX school_pt_the_geom_gist;
DROP INDEX taxlot_ply_the_geom_gist;
DROP INDEX road_ln_the_geom_gist;

2. Run the average property query, and see how fast it executes:

SELECT Sum(landvalue + impvalue) / Count(*) as avg_value
FROM taxlot_ply
WHERE
ST_DWithin(
the_geom,
ST_GeomFromText('POINT(4387009 402407)', 2270),
5280
);

3. Now, add the spatial indexes back onto your tables, and run the query again.

CREATE INDEX school_pt_the_geom_gist ON school_pt USING GIST (the_geom);
CREATE INDEX taxlot_ply_the_geom_gist ON taxlot_ply USING GIST (the_geom);
CREATE INDEX road_ln_the_geom_gist ON road_ln USING GIST (the_geom);

The unindexed query logs an execution time of over 1000ms, while with the indexes, a time of

less than 50ms is achieved.

Spatial Joins

With spatial indexes in place, we can perform spatial joins quickly, taking information from two distinct tables and joining it together on the basis of spatial relationships.

Our last query determined the average property value within a one-mile radius of a single school. We can use a spatial join to determine the average property value within a one-mile radius of every school or, to keep the result set smaller, just the high schools.

SELECT

s.name AS school_name,

Sum(t.landvalue + t.impvalue) / Count(*) AS avg_property_value

FROM taxlot_ply t, school_pt s

WHERE

ST_DWithin(t.the_geom, s.the_geom, 5280)

AND

s.type = 'High School'

GROUP BY s.name

ORDER BY avg_property_value DESC;

And now we know where to send our kids to school in Medford.
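Conceptually, the spatial join pairs every high school with every parcel within a mile of it, then averages per school. The Python sketch below spells that out as a naive nested loop. All school names, coordinates and values are made up, parcels are simplified to points, and the loop tests every pair, which is exactly the work the spatial index lets the database avoid.

```python
import math
from collections import defaultdict

RADIUS_FT = 5280  # one mile

# Hypothetical schools: (name, type, (x, y))
schools = [
    ("North High", "High School", (0, 0)),
    ("South High", "High School", (20000, 0)),
    ("Oak Elementary", "Elementary", (0, 100)),
]
# Hypothetical parcels: (x, y, landvalue, impvalue)
parcels = [
    (1000, 0, 100000, 200000),
    (0, 2000, 50000, 50000),
    (21000, 0, 80000, 70000),
    (50000, 50000, 1, 1),
]

totals = defaultdict(lambda: [0, 0])  # school name -> [sum of values, count]
for name, kind, (sx, sy) in schools:
    if kind != "High School":          # s.type = 'High School'
        continue
    for px, py, land, imp in parcels:  # the join condition (ST_DWithin)
        if math.hypot(px - sx, py - sy) <= RADIUS_FT:
            totals[name][0] += land + imp
            totals[name][1] += 1

# GROUP BY school, then ORDER BY average value, highest first.
result = sorted(
    ((name, total / count) for name, (total, count) in totals.items()),
    key=lambda row: row[1],
    reverse=True,
)
print(result)  # [('North High', 200000.0), ('South High', 150000.0)]
```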

Conclusion

These are just a few examples of using spatial SQL to query a database. In the remaining sections of the workshop, most of the querying will happen behind the scenes, as tools like GeoServer pull data from the database.

However, the power of the spatial database for analysis and querying remains easily available through scripting languages and direct user tools like pgAdmin, which can be used to quickly analyze or automate geospatial tasks.
