PostGIS “Esri union” on an “unlimited” sized data set
What is “Esri Union”?
The Norwegian Institute of Bioeconomy Research
The goal of NIBIO is to contribute to food security, sustainable resource management, innovation and value creation through research and knowledge production.
www.nibio.no
PostGIS Simple Features
Why do presentations like this?
Lars Aksel Opsahl, developer at NIBIO
5/4/16 6Postgis “esri union” on a “unlimited” sized data set
The basic logic
The basic logic
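As a rough illustration of the basic logic, here is a minimal Python sketch that applies the same idea to one-dimensional intervals instead of polygons. It assumes the usual overlay-union behaviour: parts covered by both inputs carry both ids, and parts covered by only one input keep that id and get `None` for the other. The function name `interval_union_overlay` and the tuple layout are hypothetical and are not taken from the esri_union code.

```python
def interval_union_overlay(t1, t2):
    """Overlay two lists of (id, start, end) intervals.

    Returns (t1_id, t2_id, start, end) pieces. An id is None where
    only the other input covers that piece.
    Assumes intervals within one input do not overlap each other.
    """
    # Every start or end value is a potential break point in the result.
    breaks = sorted({p for _, s, e in t1 + t2 for p in (s, e)})
    pieces = []
    for lo, hi in zip(breaks, breaks[1:]):
        mid = (lo + hi) / 2
        id1 = next((i for i, s, e in t1 if s <= mid < e), None)
        id2 = next((i for i, s, e in t2 if s <= mid < e), None)
        # Skip gaps that neither input covers.
        if id1 is not None or id2 is not None:
            pieces.append((id1, id2, lo, hi))
    return pieces


result = interval_union_overlay([("a", 0, 10)], [("x", 5, 15)])
```

Here `result` holds three pieces: one covered only by `a`, one covered by both `a` and `x`, and one covered only by `x`.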
Is Esri union on Postgres solved now? No!
Why not dive into ArcGIS?
Why was it important to solve this in PostGIS?
The code needs to:
- Handle big tables
- Handle big polygons
- Handle big servers
- Be generic
Unlimited sized table? Tested with 72 million rows.
If the primary key is INT and the result has more than 2147483647 rows, it fails.
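The limit quoted above is the largest value of a signed 32-bit integer. A small Python sketch of that arithmetic follows. The helper name `fits_in_int4` is hypothetical, and the BIGINT figure is included only to show how much higher a 64-bit key would raise the ceiling.

```python
INT4_MAX = 2**31 - 1  # PostgreSQL INTEGER: 4 bytes, signed
INT8_MAX = 2**63 - 1  # PostgreSQL BIGINT: 8 bytes, signed


def fits_in_int4(row_count):
    """True if row ids 1..row_count all fit in an INTEGER primary key."""
    return row_count <= INT4_MAX


tested_ok = fits_in_int4(72_000_000)        # the tested table size
too_many = fits_in_int4(INT4_MAX + 1)       # one row past the limit
```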
Divide and conquer pattern
both on big tables and single big polygons
Table extent for both tables
Grid for table one only
Grid for table two only
Grid cells depend on content from both tables
Grid cells depend on content from both tables
Compute cell size: fast or exact?
&& operator
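The PostGIS `&&` operator compares 2D bounding boxes rather than exact shapes, which is what makes it the fast option. The following Python sketch shows that test on plain coordinate tuples. The helper names `bbox` and `bbox_overlaps` are made up for this illustration.

```python
def bbox(points):
    """Bounding box (xmin, ymin, xmax, ymax) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_overlaps(a, b):
    """True if two boxes intersect (touching edges count as overlap)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2


# A triangle whose bounding box reaches the cell although the shape does not.
triangle = [(0, 0), (10, 0), (0, 10)]
cell = (8, 8, 12, 12)
fast_hit = bbox_overlaps(bbox(triangle), cell)
```

`fast_hit` is true even though the triangle itself never enters the cell, so a bounding-box count can overestimate how many rows really fall inside a cell. That is the price of the fast variant.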
Content based balanced grid
https://github.com/larsop/content_balanced_grid
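To give a feel for what a content based balanced grid does, here is a minimal Python sketch that keeps splitting a cell into four until no cell holds more than `max_rows` points. The quadtree-style split, the use of points instead of polygon bounding boxes, and the names `balanced_grid` and `max_depth` are all assumptions made for illustration; the repository above may work differently.

```python
def balanced_grid(points, cell, max_rows, max_depth=16):
    """Return grid cells (xmin, ymin, xmax, ymax) with at most max_rows points each.

    Cells are half-open: a point on a cell's xmax or ymax edge belongs
    to the neighbouring cell. max_depth stops endless splitting when
    many points share the same location.
    """
    xmin, ymin, xmax, ymax = cell
    inside = [(x, y) for x, y in points
              if xmin <= x < xmax and ymin <= y < ymax]
    if len(inside) <= max_rows or max_depth == 0:
        return [cell]
    xmid = (xmin + xmax) / 2
    ymid = (ymin + ymax) / 2
    quadrants = [
        (xmin, ymin, xmid, ymid),
        (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax),
        (xmid, ymid, xmax, ymax),
    ]
    cells = []
    for quadrant in quadrants:
        cells.extend(balanced_grid(inside, quadrant, max_rows, max_depth - 1))
    return cells


# Eight points clustered along the bottom edge plus one far away.
points = [(i, 1) for i in range(8)] + [(12, 12)]
cells = balanced_grid(points, (0, 0, 16, 16), max_rows=4)
```

The dense corner ends up with small cells and the sparse areas keep large ones, so each cell represents a similar amount of work.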
GRIDS -> MORE CPU USAGE
GRIDS -> PARALLEL PROCESSING
Parallel processing with 8 core dual CPU
[Bar chart: time in seconds (axis 0 to 300) for 1, 2, 4, 8 and 16 threads]
USE THE GRID TO DIVIDE AND CONQUER
Handle Big Polygons (millions of points)
Big Polygons (millions of points)
Big Polygon and one single cell
Reduce the problem to single SQL
SELECT ST_Intersection(a.geo, b.geom) AS area
FROM sl_lop.helling_data_d1 AS a,
     sl_lop.grid_ar5_helling AS b
WHERE b.id = 18
  AND a.gid = 9419961;
Interior rings
SELECT SUM(num_points)
FROM (
  SELECT ST_NumPoints(ST_InteriorRingN(a.geo, generate_series(1, 52079))) AS num_points
  FROM sl_lop.helling_data_d1 AS a
  WHERE gid = 9419961
) AS test;
Time: 474375.633 ms

SELECT SUM(ST_NumPoints(ST_ExteriorRing(geom)))
FROM (
  SELECT (ST_DumpRings(a.geo)).geom
  FROM sl_lop.helling_data_d1 AS a
  WHERE gid = 9419961
) AS test;
Time: 396.959
Build a new polygon with the interior rings that intersect our current cell
SELECT ST_MakePolygon(exterior.exterior_ring, interior.interior_ring) AS simple_polygon
FROM (
  SELECT ST_ExteriorRing(a.geo) AS exterior_ring
  FROM sl_lop.helling_data_d1 AS a
  WHERE a.gid = 9419961
) AS exterior,
(
  SELECT array_agg(ST_ExteriorRing(ring)) AS interior_ring
  FROM (
    SELECT (rec).geom AS ring, (rec).path[1] AS arrayid
    FROM (
      SELECT ST_DumpRings(a.geo) AS rec
      FROM sl_lop.helling_data_d1 AS a
      WHERE a.gid = 9419961
    ) AS a
  ) AS a,
  sl_lop.grid_ar5_helling b
  WHERE b.id = 18
    AND b.geom && a.ring
    AND a.arrayid > 0  -- missing holes later
) AS interior
Let's wrap the SQL into a simple function
https://github.com/larsop/esri_union/src/main/sql/function_01_esri_union_intersection.sql
(SELECT (array_agg(ST_ExteriorRing(a.ring))) AS interior_ring
 FROM (
   SELECT (rec).geom AS ring, (rec).path[1] AS arrayid
   FROM ( SELECT ST_DumpRings(g1) AS rec ) AS aaa
 ) AS a
 WHERE g2 && a.ring AND a.arrayid > 0
Remove grid lines, find rows
SELECT id_array, id_to_keep
FROM (
  SELECT *
  FROM (
    SELECT count(r.*) AS counts,
           array_agg(distinct(r.id)) id_array,
           min(r.id) AS id_to_keep
    FROM sl_lop.res_ar250_sk_grl r
    WHERE r.t1_sl_sdeid is NOT NULL AND r.t2_komid is NOT NULL
    GROUP BY r.t1_sl_sdeid, r.t2_komid
  ) AS t
  WHERE counts > 1
  -- UNION WHERE r.t1_sl_sdeid is NOT NULL AND r.t2_komid is NULL
  -- UNION WHERE r.t2_komid is NOT NULL AND r.t1_sl_sdeid is NULL
) AS to_update
Remove grid lines, ST_Union and GROUP BY
https://github.com/larsop/esri_union/ in function_01_esri_union_remove_grid.sql
UPDATE sl_lop.res_ar250_sk_grl AS r
SET geom = ST_Multi(u.geom)
FROM (
  SELECT ST_Union(r.geom) AS geom
  FROM sl_lop.res_ar250_sk_grl r
  WHERE r.id = ANY('{3050,3681}')
) AS u
WHERE r.id = '3050';

DELETE FROM sl_lop.res_ar250_sk_grl AS r
WHERE r.id = ANY('{3050,3681}')
  AND r.id > '3050';
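The two "remove grid lines" steps can be read as: group result rows by the pair of source ids, and wherever a pair occurs more than once (because a grid line cut the piece in two), keep the lowest row id and merge the others into it. A minimal Python sketch of the row-finding part follows. The dictionary keys `t1_id` and `t2_id` and the function name are hypothetical stand-ins for the real column names, and only the case where both ids are present is covered.

```python
from collections import defaultdict


def find_rows_to_merge(rows):
    """Return (id_array, id_to_keep) for every id pair that occurs more than once.

    rows is a list of dicts with the keys 'id', 't1_id' and 't2_id'.
    Rows where either source id is missing are ignored here.
    """
    groups = defaultdict(list)
    for row in rows:
        if row["t1_id"] is not None and row["t2_id"] is not None:
            groups[(row["t1_id"], row["t2_id"])].append(row["id"])
    return [(sorted(ids), min(ids)) for ids in groups.values() if len(ids) > 1]


rows = [
    {"id": 3050, "t1_id": 7, "t2_id": 9},
    {"id": 3681, "t1_id": 7, "t2_id": 9},    # same pair, split by a grid line
    {"id": 4000, "t1_id": 7, "t2_id": 10},
    {"id": 4001, "t1_id": 8, "t2_id": None},
]
to_update = find_rows_to_merge(rows)
```

For each entry in `to_update`, the geometries of all ids in `id_array` would be unioned into the row with `id_to_keep`, and the remaining rows deleted, as the UPDATE and DELETE statements above do for ids 3050 and 3681.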
PostGIS Topology and grid lines
Generic code
json_each_text
GENERIC CODE, USING EXECUTE ON GENERATED STRING
SELECT
  array_agg(''t_'' || %s || ''.'' || quote_ident(update_column) || '' AS '' || '' t'' || %s || ''_'' || quote_ident(update_column)) AS new_column_as_tmp,
  array_agg('' t'' || %s || ''_'' || quote_ident(update_column)) AS new_column_name_tmp,
  array_agg(quote_ident(update_column)) AS org_column_name_tmp
FROM (
  SELECT distinct(key) AS update_column
  FROM (SELECT * FROM %s limit 1) AS t, json_each_text(to_json((t)))
  WHERE key != %L
) AS keys',
i, i, i, schema_table_name, geo_colums_array[i]);
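The generated string above discovers the column names of an arbitrary table from one sample row and builds prefixed aliases for them. The same idea can be sketched in Python on a plain dictionary. The function name `build_column_aliases` is hypothetical, and identifier quoting (`quote_ident`) is left out to keep the sketch short.

```python
def build_column_aliases(sample_row, table_index, geo_column):
    """Build select aliases for every column except the geometry column.

    sample_row plays the role of one row converted to JSON, where the
    keys are the column names. Returns three lists: the
    "t_N.col AS tN_col" expressions, the new names and the original names.
    """
    columns = sorted(key for key in sample_row if key != geo_column)
    select_as = [f"t_{table_index}.{c} AS t{table_index}_{c}" for c in columns]
    new_names = [f"t{table_index}_{c}" for c in columns]
    return select_as, new_names, columns


sample = {"gid": 1, "h_klasse": 2, "utmsone": 33, "geo": "..."}
select_as, new_names, org_names = build_column_aliases(sample, 2, "geo")
```

Because the column list is read from the data itself, the same code works for any pair of input tables, which is what makes the approach generic.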
Processing cells in parallel
Parallel processing, generate code
psql -h host -U postgres -t -q -o /tmp/t1.sql sl -c "select get_esri_union_muti_thread('org_ar5.ar5_flate sl_sdeid geo','org_helling.hellingklasser_dted10_3soner_flate gid geo','sl_lop.ar5_helling',3000,'sl_lop.ar5_helling_c1',False,False)"
Generated code has different parameters for each cell
SELECT get_esri_union_cell('(1,37666,"{""org_ar5.ar5_flate AS t_1"",""org_helling.hellingklasser_dted10_3soner_flate AS t_2""}","{sl_sdeid,gid}","{""sl_sdeid,kartid,noyaktighet,objtype,informasjon,registreringsversjon,synbarhet,maalemetode,datafangstdato,artype,arkartstd,arskogbon,kjoringsident,opphav,verifiseringsdato,argrunnf,areal,artreslag"",""gid,h_klasse,utmsone""}","{""t1_sl_sdeid, t1_kartid, t1_noyaktighet, t1_objtype, t1_informasjon, t1_registreringsversjon, t1_synbarhet, t1_maalemetode, t1_datafangstdato, t1_artype, t1_arkartstd, t1_arskogbon, t1_kjoringsident, t1_opphav, t1_verifiseringsdato, t1_argrunnf, t1_areal, t1_artreslag"","" t2_gid, t2_h_klasse, t2_utmsone""}","{""t_1.sl_sdeid AS t1_sl_sdeid,t_1.kartid AS t1_kartid,t_1.noyaktighet AS t1_noyaktighet,t_1.objtype AS t1_objtype,t_1.informasjon AS t1_informasjon,t_1.registreringsversjon AS t1_registreringsversjon,t_1.synbarhet AS t1_synbarhet,t_1.maalemetode AS t1_maalemetode,t_1.datafangstdato AS t1_datafangstdato,t_1.artype AS t1_artype,t_1.arkartstd AS t1_arkartstd,t_1.arskogbon AS t1_arskogbon,t_1.kjoringsident AS t1_kjoringsident,t_1.opphav AS t1_opphav,t_1.verifiseringsdato AS t1_verifiseringsdato,t_1.argrunnf AS t1_argrunnf,t_1.areal AS t1_areal,t_1.artreslag AS t1_artreslag"",""t_2.gid AS t2_gid,t_2.h_klasse AS t2_h_klasse,t_2.utmsone AS t2_utmsone""}","{geo,geo}","{t_1.geo,t_2.geo}",sl_lop.ar5_helling,sl_lop.ar5_helling_c1,tmp_data_esri_intersects_t1_7ce543ab6a2b5dc0c43af863b659d81a,tmp_data_esri_intersects_t2_7ce543ab6a2b5dc0c43af863b659d81a,tmp_data_esri_intersects_7ce543ab6a2b5dc0c43af863b659d81a)')
Parallel processing of each cell
parallel --verbose -j 16 psql -h host -U user sl -c :::: /tmp/t1.sql
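The command above hands one generated statement per grid cell to a pool of up to 16 concurrent jobs. As a loose analogy only, the same pattern can be sketched in Python with a thread pool. The names `run_in_parallel` and `psql_worker` are hypothetical, and the `psql` call simply reuses the placeholder host, user and database from the slide.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(commands, worker, jobs=16):
    """Run worker(command) for every command, at most `jobs` at a time.

    Results come back in the same order as the commands.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(worker, commands))


def psql_worker(sql):
    """Hypothetical worker: run one generated statement through psql."""
    completed = subprocess.run(
        ["psql", "-h", "host", "-U", "user", "sl", "-c", sql],
        check=True,
    )
    return completed.returncode


# Stand-in worker so the sketch runs without a database.
lengths = run_in_parallel(["SELECT 1", "SELECT 22"], len, jobs=2)
```

Each cell is independent of the others, so no coordination between workers is needed beyond collecting the results.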
This code, tests and wiki are found at https://github.com/larsop/esri_union
Thanks