OSM data in MariaDB / MySQL - All the world in a few large tables

30
OSM data in MariaDB/MySQL All the world in a few large tables Well, almost ... Hartmut Holzgraefe SkySQL AB [email protected] February 1, 2014 Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 1 / 30

description

FOSDEM 2014 talk on importing OpenStreetMap data into MySQL and latest MySQL/MariaDB feature improvements

Transcript of OSM data in MariaDB / MySQL - All the world in a few large tables

Page 1: OSM data in MariaDB / MySQL - All the world in a few large tables

OSM data in MariaDB/MySQLAll the world in a few large tables

Well, almost ...

Hartmut Holzgraefe

SkySQL AB [email protected]

February 1, 2014

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 1 / 30

Page 2: OSM data in MariaDB / MySQL - All the world in a few large tables

Overview

1 MySQL, MariaDB and GIS

2 OpenStreetMap

3 osm2pgsql

4 Examples

5 The End ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 2 / 30

Page 3: OSM data in MariaDB / MySQL - All the world in a few large tables

MySQL, MariaDB and GIS

1 MySQL, MariaDB and GISHistoryCurrent StatusRoadmap

2 OpenStreetMap

3 osm2pgsql

4 Examples

5 The End ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 3 / 30

Page 4: OSM data in MariaDB / MySQL - All the world in a few large tables

History

First appeard in MySQL 4.1 (2004)

... with MBR relations only

Lab Release adds true spatial relations (200?)

MariaDB 5.3 GA with true spatial relations (2011)

MySQL 5.6 GA with true spatial relations (2013)

... to be continued ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 4 / 30

Page 5: OSM data in MariaDB / MySQL - All the world in a few large tables

MBR is not enough

MBR CONTAINS() True ST CONTAINS()

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 5 / 30

Page 6: OSM data in MariaDB / MySQL - All the world in a few large tables

Current Status

Spatial relations work

... but the world is still flat (no projections)

OK for many use cases

... but be aware of gotchas like DISTANCE()

GEOMETRY types in all storage engines

... but only MyISAM has SPATIAL indexes

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 6 / 30

Page 7: OSM data in MariaDB / MySQL - All the world in a few large tables

MariaDB 10.1 Roadmap

System Tables (spatial ref sys, geometry columns, ...)

Precision math calculations and storage

Coordinate transformations / projections

3rd coordinate (e.g. for altitude)

all spatial functions required by OGC

spatial aware optimizer

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 7 / 30

Page 8: OSM data in MariaDB / MySQL - All the world in a few large tables

... and beyond

SPATIAL indexes in other storage engines

3D calculations

client side support for GIS transformations

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 8 / 30

Page 9: OSM data in MariaDB / MySQL - All the world in a few large tables

Openstreetmap

1 MySQL, MariaDB and GIS

2 OpenStreetMapIntroCore Data ModelData AccessData Import

3 osm2pgsql

4 Examples

5 The End ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 9 / 30

Page 10: OSM data in MariaDB / MySQL - All the world in a few large tables

OpenStreetMap History

founded in 2004 by SteveCoast

data under open license(CC-BY-SA first, nowODBL)

1.5 million contributors

2 billion map nodes

200 million ways

2 million relations

almost 4 billion GPX points

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 10 / 30

Page 11: OSM data in MariaDB / MySQL - All the world in a few large tables

Pretty Tiles

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 11 / 30

Page 12: OSM data in MariaDB / MySQL - All the world in a few large tables

... and raw data

Raw map data can be used forother things, too:

for routing

for coverage checks

for flight simulators

for science

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 12 / 30

Page 13: OSM data in MariaDB / MySQL - All the world in a few large tables

Core Data Model

Just three simple things

Nodes (Points)

Ways

Relations

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 13 / 30

Page 14: OSM data in MariaDB / MySQL - All the world in a few large tables

Nodes

Nodes describe a single point at a specific location using:

A numeric ID

Object version, Timestamp of last change, User

Node coordinates

Node attributes as key/value pairs

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 14 / 30

Page 15: OSM data in MariaDB / MySQL - All the world in a few large tables

Ways

Ways form an open or closed line by connecting nodes, using:

A numeric ID

Object version, Timestamp of last change, User

An ordered list of node IDs

Way attributes as key/value pairs

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 15 / 30

Page 16: OSM data in MariaDB / MySQL - All the world in a few large tables

Relations

Relations bundle objects to describe more complex relations, using:

A numeric ID

Object version, Timestamp of last change, User

Ordered lists of member nodes, ways and sub-relations

Optional member roles

Attributes as key/value pairs

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 16 / 30

Page 17: OSM data in MariaDB / MySQL - All the world in a few large tables

Data Access

The main database is not exposed directly

Only one central instance, accessible via “The API”

API meant for editor applications only

Full data export once a week (“the planet”)

Plus daily, hourly, minutely diffs

Two file formats for planets:

.osm XML based, usually bz2 compressed (32GB packed, 400GBunpacked)

.pbf compact binary format based on Google ProtoBuf (23GB)

Regional extracts available by 3rd parties, e.g. GeoFabrik.de

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 17 / 30

Page 18: OSM data in MariaDB / MySQL - All the world in a few large tables

Data Import

The raw data is not really suitable for most purposes, esp. rendering

Several import/preprocessing tools provide more convenient schemas,e.g. by

... only extracting certain attributes

... making a difference beween ways and areas

... resolving relations into simpler objects

Besides osm2pgsql that I’m about to talk about in a minute there are alsoimpOsm, ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 18 / 30

Page 19: OSM data in MariaDB / MySQL - All the world in a few large tables

osm2pgsql

1 MySQL, MariaDB and GIS

2 OpenStreetMap

3 osm2pgsql

Block DiagramData Model AgainAdding MySQL SupportPerformance

4 Examples

5 The End ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 19 / 30

Page 20: OSM data in MariaDB / MySQL - All the world in a few large tables

osm2pgsql

osm2pgsql is a tool

written in C

with a small C++ part now

reads OSM data

preprocesses it

stores results in relational tables

originally in PostGIS only

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 20 / 30

Page 21: OSM data in MariaDB / MySQL - All the world in a few large tables

Block Diagram

Figure : osm2pgsql block diagram

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 21 / 30

Page 22: OSM data in MariaDB / MySQL - All the world in a few large tables

Data Model Again

prefix point for single node POIs

prefix line for linear 2D objects like roads, rivers, power lines ...

prefix roads a subset of the above, optimizer for rendering

prefix polygon objects covering an area: buildings, landuse,administrative borders ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 22 / 30

Page 23: OSM data in MariaDB / MySQL - All the world in a few large tables

Adding MySQL Support

turned out to be more tricky than thought

some core parts directly called PostgreSQL functions

a lot of general functionality was hidden in PostgreSQL specificmodules

MySQL output module works, could be faster though

MySQL middle layer is “code complete” ...

... but crashes while processing relations :(

so for now imports are limited by RAM size

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 23 / 30

Page 24: OSM data in MariaDB / MySQL - All the world in a few large tables

Import performance

imports currently take about 4-5 times as long

... as we have no direct equivalent to COPY

osm2pgsql at less than 50% CPU only

... so switching to async API would be a 2x win already

... with multi-insert even more so

index building is faster ...

... but may not be once we get I/O bound

tables on disk are of similar size

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 24 / 30

Page 25: OSM data in MariaDB / MySQL - All the world in a few large tables

Query performance

select count(*)

from nrw_point n

join nrw_polygon p

on st_contains(p.way,n.way)

where p.name = ’Bielefeld’

and n.amenity=’post_box’;

Data Set MySQL 5.5 (MBR) MariaDB 5.5 PostGIS

Germany 15.8s 16.5s ?Northrhine-Westfalia 2.7s 3.1s 6.1ssame with better indexes 0.2s 0.2s .04s

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 25 / 30

Page 26: OSM data in MariaDB / MySQL - All the world in a few large tables

Examples

1 MySQL, MariaDB and GIS

2 OpenStreetMap

3 osm2pgsql

4 Examples

5 The End ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 26 / 30

Page 27: OSM data in MariaDB / MySQL - All the world in a few large tables

The End ...

1 MySQL, MariaDB and GIS

2 OpenStreetMap

3 osm2pgsql

4 Examples

5 The End ...

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 27 / 30

Page 28: OSM data in MariaDB / MySQL - All the world in a few large tables

References

Contact [email protected]

MariaDB GIS https://mariadb.com/kb/en/gis-functionality/

MySQL GIS http://dev.mysql.com/doc/refman/5.6/en/spatial-extensions.html

OpenStreetMap http://openstreetmap.org/

MapCompare http://mc.bbbike.org/mc/

RiverMap http://www.kompf.de/gps/rivermap.html

osm2pgsql https://wiki.openstreetmap.org/wiki/Osm2pgsql

My Code https://github.com/hholzgra/osm2pgsql/tree/devel-mysql

Table Files http://php-groupies.de/gis-data/ (soon)

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 28 / 30

Page 29: OSM data in MariaDB / MySQL - All the world in a few large tables

Questions!

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 29 / 30

Page 30: OSM data in MariaDB / MySQL - All the world in a few large tables

The End?Or just the beginning?

Hartmut Holzgraefe (SkySQL) OSM data in MariaDB/MySQL February 1, 2014 30 / 30