Corinne Hutchinson's 7/8/2015 PuPPy Presentation on GeoDjango

It’s All About Location, Location, Location

Corinne Hutchinson

July 8, 2015 -- PuPPy

Highlights of Tonight’s Talk

● Reasons for using location data in a web application

● Overview of GeoDjango & setting up a basic web application with geospatial support

● Scaling a system reliant on geospatial queries

Basic GeoSpatial Questions

● Where is A located? (i.e. point-in-polygon, mappability)

● What is the distance between A & B?

● What is the shortest path between A & B? (i.e. route planning)

● What’s the elevation change between A & B?
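
The first two questions can be answered in a few lines of plain Python. The sketch below is illustrative only and is not taken from the talk: the function names are made up here, the distance uses the haversine formula on a spherical Earth (an approximation), and the point-in-polygon test is a planar ray-casting check that treats coordinates as flat x/y values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

A spatial database does the same work with proper geodetic math and spatial indexes, which is the reason to reach for GeoDjango rather than hand-rolled helpers like these.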

GeoDjango

● Included in standard installations of Django since 1.4

● Allows importing of geospatial data from essentially any vector data source (e.g. KML, shapefiles); raster data (bitmaps, etc) are not supported

● Provides familiar ORM interface for geospatial queries

● Straightforward to learn: excellent tutorial on main Django site

Starting a Geospatial Application: DB

● Pick your database and install any needed extensions

● Per-database setup guides are in the GeoDjango docs; supported options are PostgreSQL/PostGIS, MySQL, SQLite (via SpatiaLite), or Oracle

● PostgreSQL/PostGIS and Oracle Spatial are generally considered the most mature spatial database options

Starting a Geospatial Application: Models

● Define geomodels

● Add admin interface
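
As a rough illustration of these two steps, a geomodel and its admin registration might look like the fragment below. The `Neighborhood` model, its field names, and the file layout are hypothetical. The fragment assumes a recent Django release (4.0 or later for `GISModelAdmin`) and only works inside a Django project that is already configured with a spatial database backend; it is not a standalone script.

```python
# models.py (hypothetical app)
from django.contrib.gis.db import models


class Neighborhood(models.Model):
    """A named area stored as one or more polygons."""
    name = models.CharField(max_length=100)
    # SRID 4326 is WGS84 longitude/latitude, the GeoDjango default.
    boundary = models.MultiPolygonField(srid=4326)

    def __str__(self):
        return self.name


# admin.py (same hypothetical app)
from django.contrib.gis import admin

from .models import Neighborhood

# GISModelAdmin renders geometry fields with an interactive map widget.
# Releases from the era of this talk used admin classes such as OSMGeoAdmin.
admin.site.register(Neighborhood, admin.GISModelAdmin)
```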

Adding Data? Lots of Free Geospatial Data Sources

● US Census TIGER (Topologically Integrated Geographic Encoding and Referencing): political boundaries e.g. states, counties, metro areas

● Natural Earth: natural features

● OpenStreetMap: map tiles, land use, etc

● NASA’s Socioeconomic Data and Applications Center (SEDAC): data about human-environment interactions e.g. land use, poverty, climate

● Open Topography: topo data, most of the world

Simple Data Import Tools

Modifying/Updating Polygons

● GeoDjango admin provides drag-and-drop editing tools

Using Your App: Making Queries

● Geospatial lookups through ORM

o distance

o point-in-polygon
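
Both kinds of lookup can be written as ordinary queryset filters. The helpers below are a sketch, not code from the talk: the function names and the default field names (`boundary`, `location`) are hypothetical, and the code assumes a configured GeoDjango project whose models have a polygon field and a point field respectively. The `contains` and `distance_lte` lookups, `Point`, and `D` are part of GeoDjango.

```python
from django.contrib.gis.geos import Point
from django.contrib.gis.measure import D


def polygons_containing(queryset, lon, lat, field="boundary"):
    """Point-in-polygon: rows whose polygon field contains the point."""
    # GEOS points take (x, y), which means (longitude, latitude).
    point = Point(lon, lat, srid=4326)
    return queryset.filter(**{f"{field}__contains": point})


def points_within(queryset, lon, lat, km, field="location"):
    """Distance: rows whose point field lies within `km` kilometers."""
    point = Point(lon, lat, srid=4326)
    return queryset.filter(**{f"{field}__distance_lte": (point, D(km=km))})
```

With a hypothetical `Neighborhood` model, a call would look like `polygons_containing(Neighborhood.objects.all(), -122.33, 47.61)`.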

This is great! Can we scale it?

● Minimize direct database hits

● Re-route duplicated database calls to cache (e.g. Redis, Memcached)

● Reformat our data for cacheability: geohashes

Geohashes

● Developed by Gustavo Niemeyer, entered into public domain in 2008

● Method of sequentially subdividing the globe into spatial buckets

● Buckets represented as encoded binary strings (e.g. 0010110101011100011000110001101111000111 -> 5pf666y7)

● Allows for very fast point-in-polygon lookups
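
The subdivision and encoding described above can be sketched in pure Python. This is an illustrative implementation rather than the python-geohash library's code, and the function names are chosen here for the example. Bits alternate between longitude and latitude, each bit halving the remaining range, and every group of five bits becomes one base-32 character.

```python
# Geohash base-32 alphabet: digits plus lowercase letters, omitting a, i, l, o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def bits_to_geohash(bits):
    """Convert a bit string (length a multiple of 5) to geohash text."""
    return "".join(
        BASE32[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)
    )


def encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    use_lon = True  # the first bit always subdivides longitude
    while len(bits) < precision * 5:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append("1")
            rng[0] = mid  # keep the upper half
        else:
            bits.append("0")
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
    return bits_to_geohash("".join(bits))
```

Because each extra character only refines the previous bucket, a shorter geohash is always a prefix of a longer one for the same point, which is what makes geohashes convenient cache keys.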

Examples

Adding Geohashes to Point-in-Polygon Lookups

● Choose level of precision (5 or 6 are likely good)

● Convert point to a geohash, then extract the center of that geohash

● DB lookup to determine containing polygon

● Finally, cache the geohash-to-polygon mapping (i.e. set the key ‘c22zp’ to the value ‘Seattle’)

● Subsequent lookups: check cache for existing key matching geohash before conducting DB lookup
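
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: a plain dict stands in for Redis or Memcached, `fake_db_lookup` is a hypothetical stand-in for the real point-in-polygon database query, and the function names are invented for the example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet


def geohash_cell(lat, lon, precision=5):
    """Return (geohash, center_lat, center_lon) for the cell holding the point."""
    ranges = {"lat": [-90.0, 90.0], "lon": [-180.0, 180.0]}
    values = {"lat": lat, "lon": lon}
    chars, index = [], 0
    for bit in range(precision * 5):
        axis = "lon" if bit % 2 == 0 else "lat"  # bits alternate, longitude first
        low, high = ranges[axis]
        mid = (low + high) / 2
        index <<= 1
        if values[axis] >= mid:
            index |= 1
            ranges[axis][0] = mid
        else:
            ranges[axis][1] = mid
        if bit % 5 == 4:  # every five bits yield one base-32 character
            chars.append(BASE32[index])
            index = 0
    center_lat = sum(ranges["lat"]) / 2
    center_lon = sum(ranges["lon"]) / 2
    return "".join(chars), center_lat, center_lon


def cached_polygon_lookup(lat, lon, cache, db_lookup, precision=5):
    """Check the cache by geohash first; query the DB with the cell center on a miss."""
    key, center_lat, center_lon = geohash_cell(lat, lon, precision)
    if key in cache:
        return cache[key]
    name = db_lookup(center_lat, center_lon)
    cache[key] = name
    return name


# Demo with a stand-in for the database query (hypothetical).
calls = []


def fake_db_lookup(lat, lon):
    calls.append((lat, lon))
    return "Seattle"


cache = {}
cached_polygon_lookup(47.6062, -122.3321, cache, fake_db_lookup)  # miss: hits the "DB"
cached_polygon_lookup(47.6065, -122.3325, cache, fake_db_lookup)  # same cell: cache hit
```

Looking up the cell center rather than the original point means every point in a cell gets the same answer, so the precision level trades accuracy near polygon borders against cache hit rate.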

Example:

Take-Aways

● GeoDjango is simple to use

● Geohashes are a good tool to help scale geospatial lookups

More Awesome GeoSpatial Libraries

● OGR/GDAL: interacting with geospatial data formats, i.e. opening files, etc

● PyShp: ESRI shapefile handling in pure Python (https://pypi.python.org/pypi/pyshp)

● PySAL: spatial analysis functions (https://github.com/pysal/pysal)

● PyQGIS: essentially anything you might want to do with GIS data (http://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/intro.html)

● geopy: geocoding; integration with OpenStreetMap, Google Geocoding API, Baidu Maps, and many more (https://github.com/geopy/geopy)

● python-geohash: encoding/decoding points to geohashes, looking up geohash neighbors

● descartes: plotting geometric objects in matplotlib (https://pypi.python.org/pypi/descartes)

● NumPy: data wrangling (http://www.numpy.org/)

● pandas: data wrangling (http://pandas.pydata.org/)