Saving the Elephant with Slonik
![Page 1: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/1.jpg)
Saving the Elephant with Slonik
Agnieszka Figiel @agnessa480
UNEP-WCMC
Railsberry 2013
![Page 2: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/2.jpg)
Taxon concepts and ranks
taxon concepts
ranks
![Page 3: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/3.jpg)
A brief history of gorilla classification
| Author & Year | Scientific name |
|---|---|
| Savage 1847 | Troglodytes gorilla (Pan gorilla) |
| I. Geoffroy St. Hilaire 1852 | Gorilla gorilla |
| Tuttle 1967 | Pan gorilla |
| Groves 1967 | Gorilla gorilla gorilla |

homonym, synonym, split / merge
![Page 4: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/4.jpg)
A matter of opinion
Taxonomy A: Loxodonta africana
Taxonomy B: Loxodonta africana, Loxodonta cyclotis
![Page 5: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/5.jpg)
#1: CTEs
WITH name [ (columns) ] AS (
    attached query
)
primary query
![Page 6: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/6.jpg)
![Page 7: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/7.jpg)
WITH endemic_taxon_concepts AS (
    SELECT taxon_concept_id
    FROM distributions
    GROUP BY taxon_concept_id
    HAVING COUNT(*) = 1
),
countries_with_endemic_distributions AS (
    SELECT d.geo_entity_id, COUNT(d.taxon_concept_id) AS cnt
    FROM distributions d
    INNER JOIN endemic_taxon_concepts q
        ON d.taxon_concept_id = q.taxon_concept_id
    GROUP BY d.geo_entity_id
)
SELECT geo_entities.name_en, cnt
FROM countries_with_endemic_distributions q
INNER JOIN geo_entities ON geo_entities.id = q.geo_entity_id
ORDER BY cnt DESC
![Page 8: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/8.jpg)
| name | cnt |
|---|---|
| Indonesia | 1353 |
| Mexico | 1069 |
| Madagascar | 970 |
| Australia | 886 |
| Brazil | 763 |
| Ecuador | 564 |
| Papua New Guinea | 561 |
| South Africa | 532 |
| United States of America | 520 |
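The chained CTEs above can be tried on a toy dataset. The following is a minimal sketch, assuming Python's built-in `sqlite3` as a stand-in for PostgreSQL; the sample rows and the helper names `make_demo_db` and `endemic_counts` are made up for illustration, and a tie-breaking sort on the name was added so the output is deterministic.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE geo_entities (id INTEGER PRIMARY KEY, name_en TEXT);
        CREATE TABLE distributions (taxon_concept_id INTEGER, geo_entity_id INTEGER);
        INSERT INTO geo_entities VALUES (1, 'Indonesia'), (2, 'Mexico'), (3, 'Brazil');
        INSERT INTO distributions VALUES
            (101, 1), (102, 1),   -- two taxa recorded only in Indonesia
            (103, 2),             -- one taxon recorded only in Mexico
            (104, 2), (104, 3);   -- shared by Mexico and Brazil, so not endemic
    """)
    return conn


def endemic_counts(conn):
    """Return (country, number of endemic taxon concepts), highest count first."""
    return conn.execute("""
        WITH endemic_taxon_concepts AS (
            -- taxon concepts with exactly one distribution row
            SELECT taxon_concept_id
            FROM distributions
            GROUP BY taxon_concept_id
            HAVING COUNT(*) = 1
        ),
        countries_with_endemic_distributions AS (
            SELECT d.geo_entity_id, COUNT(d.taxon_concept_id) AS cnt
            FROM distributions d
            INNER JOIN endemic_taxon_concepts q
                ON d.taxon_concept_id = q.taxon_concept_id
            GROUP BY d.geo_entity_id
        )
        SELECT geo_entities.name_en, q.cnt
        FROM countries_with_endemic_distributions q
        INNER JOIN geo_entities ON geo_entities.id = q.geo_entity_id
        ORDER BY q.cnt DESC, geo_entities.name_en
    """).fetchall()
```

Calling `endemic_counts(make_demo_db())` lists Indonesia with 2 and Mexico with 1; Brazil drops out because its only taxon is shared with Mexico.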
![Page 9: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/9.jpg)
Data-modifying CTEs
WITH deactivated_geo_entities AS (
    UPDATE geo_entities
    SET is_active = FALSE
    WHERE id IN (#{old_geo_entity_ids})
    RETURNING id
)
UPDATE distributions
SET geo_entity_id = #{new_geo_entity_id}
FROM deactivated_geo_entities
WHERE distributions.geo_entity_id = deactivated_geo_entities.id
CTE = materialized by design
![Page 10: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/10.jpg)
#2: Recursive CTEs
WITH RECURSIVE name [ (columns) ] AS (
    non-recursive term
    UNION [ALL]
    recursive term
)
primary query
![Page 11: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/11.jpg)
WITH RECURSIVE self_and_descendants (id, full_name) AS (
    SELECT id, full_name
    FROM taxon_concepts
    WHERE id = 472
    UNION
    SELECT hi.id, hi.full_name
    FROM taxon_concepts hi
    JOIN self_and_descendants d ON d.id = hi.parent_id
)
SELECT COUNT(*) FROM self_and_descendants
count
432
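The descendants query can be exercised on a small tree. This is a sketch only, assuming `sqlite3` in place of PostgreSQL (both accept this `WITH RECURSIVE` form); the ids, the sample taxa and the helper names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Small in-memory taxon tree; ids are invented for the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE taxon_concepts (
            id INTEGER PRIMARY KEY, parent_id INTEGER, full_name TEXT
        );
        INSERT INTO taxon_concepts VALUES
            (1, NULL, 'Canidae'),
            (2, 1, 'Canis'),
            (3, 2, 'Canis lupus'),
            (4, 2, 'Canis aureus'),
            (5, 1, 'Cerdocyon'),
            (6, 5, 'Cerdocyon thous'),
            (7, NULL, 'Felidae');
    """)
    return conn


def count_self_and_descendants(conn, root_id):
    """Count the taxon with the given id plus everything below it."""
    row = conn.execute("""
        WITH RECURSIVE self_and_descendants (id, full_name) AS (
            -- non-recursive term: the starting taxon
            SELECT id, full_name FROM taxon_concepts WHERE id = ?
            UNION
            -- recursive term: children of rows found so far
            SELECT hi.id, hi.full_name
            FROM taxon_concepts hi
            JOIN self_and_descendants d ON d.id = hi.parent_id
        )
        SELECT COUNT(*) FROM self_and_descendants
    """, (root_id,)).fetchone()
    return row[0]
```

With this data, starting from Canidae (id 1) counts 6 rows, starting from Canis (id 2) counts 3, and the childless Felidae (id 7) counts only itself.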
![Page 12: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/12.jpg)
WITH RECURSIVE self_and_ancestors (parent_id, full_name, level) AS (
    SELECT parent_id, full_name, 1
    FROM taxon_concepts
    WHERE id = 5563
    UNION
    SELECT hi.parent_id, hi.full_name, q.level + 1
    FROM taxon_concepts hi
    JOIN self_and_ancestors q ON hi.id = q.parent_id
)
SELECT full_name
FROM self_and_ancestors
ORDER BY level DESC
![Page 13: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/13.jpg)
WITH crocodile_ancestry AS (
    WITH RECURSIVE self_and_ancestors (
        -- [AS IN PREVIOUS SLIDE]
    )
)
SELECT ARRAY_TO_STRING(ARRAY_AGG(full_name), ' > ') AS breadcrumb
FROM crocodile_ancestry
breadcrumb
Animalia > Chordata > Reptilia > Crocodylia > Crocodylidae > Crocodylus > Crocodylus niloticus
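A breadcrumb like the one above can be reproduced on toy data. The sketch below assumes `sqlite3` rather than PostgreSQL; since SQLite has no `ARRAY_AGG`, the ordered rows are joined in Python instead, which is a swapped-in step and not what the slide does. The ids and the helper names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """In-memory lineage of the Nile crocodile; ids are invented."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE taxon_concepts (
            id INTEGER PRIMARY KEY, parent_id INTEGER, full_name TEXT
        );
        INSERT INTO taxon_concepts VALUES
            (1, NULL, 'Animalia'),
            (2, 1, 'Chordata'),
            (3, 2, 'Reptilia'),
            (4, 3, 'Crocodylia'),
            (5, 4, 'Crocodylidae'),
            (6, 5, 'Crocodylus'),
            (7, 6, 'Crocodylus niloticus');
    """)
    return conn


def breadcrumb(conn, taxon_id):
    """Walk from the taxon up to the root and join the names, root first."""
    rows = conn.execute("""
        WITH RECURSIVE self_and_ancestors (parent_id, full_name, level) AS (
            SELECT parent_id, full_name, 1
            FROM taxon_concepts
            WHERE id = ?
            UNION
            -- each step climbs one level towards the root
            SELECT hi.parent_id, hi.full_name, q.level + 1
            FROM taxon_concepts hi
            JOIN self_and_ancestors q ON hi.id = q.parent_id
        )
        SELECT full_name FROM self_and_ancestors ORDER BY level DESC
    """, (taxon_id,)).fetchall()
    # Joining here keeps the ordering explicit via the ORDER BY above.
    return " > ".join(name for (name,) in rows)
```

`breadcrumb(make_demo_db(), 7)` yields the same string as the slide's result row.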
![Page 14: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/14.jpg)
Cascade with exceptions
![Page 15: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/15.jpg)
![Page 16: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/16.jpg)
WITH RECURSIVE cascading_refs(taxon_concept_id, exclusions) AS (
    SELECT h.id, h_refs.excluded_taxon_concepts_ids
    FROM taxon_concepts h
    LEFT JOIN taxon_concept_references h_refs
        ON h_refs.taxon_concept_id = h.id
    WHERE h.id = 10 AND h_refs.reference_id = 369
    UNION
    SELECT hi.id, cascading_refs.exclusions
    FROM taxon_concepts hi
    JOIN cascading_refs
        ON cascading_refs.taxon_concept_id = hi.parent_id
    WHERE NOT COALESCE(cascading_refs.exclusions, ARRAY[]::INT[]) @> ARRAY[hi.id]
)
UPDATE taxon_concepts
SET has_std_ref = TRUE
FROM cascading_refs
WHERE cascading_refs.taxon_concept_id = taxon_concepts.id
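The cascade with exceptions can be approximated outside PostgreSQL. This is a rough sketch, assuming `sqlite3`: SQLite has no array type, so the `excluded_taxon_concepts_ids` array is replaced by a hypothetical `reference_exclusions` table, and `UPDATE ... FROM` is replaced by an `IN` subquery. The sample rows and helper names are invented.

```python
import sqlite3

CASCADE_SQL = """
WITH RECURSIVE cascading_refs (taxon_concept_id) AS (
    -- start at the taxon that carries the reference
    SELECT h.id
    FROM taxon_concepts h
    JOIN taxon_concept_references h_refs ON h_refs.taxon_concept_id = h.id
    WHERE h.id = :root_id AND h_refs.reference_id = :reference_id
    UNION
    -- cascade to children unless they are listed as exclusions
    SELECT hi.id
    FROM taxon_concepts hi
    JOIN cascading_refs ON cascading_refs.taxon_concept_id = hi.parent_id
    WHERE hi.id NOT IN (
        SELECT excluded_taxon_concept_id
        FROM reference_exclusions
        WHERE taxon_concept_id = :root_id AND reference_id = :reference_id
    )
)
UPDATE taxon_concepts
SET has_std_ref = 1
WHERE id IN (SELECT taxon_concept_id FROM cascading_refs)
"""


def make_demo_db():
    """Toy tree where reference 369 sits on taxon 10 and excludes taxon 12."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE taxon_concepts (
            id INTEGER PRIMARY KEY, parent_id INTEGER, full_name TEXT,
            has_std_ref INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE taxon_concept_references (
            taxon_concept_id INTEGER, reference_id INTEGER
        );
        -- hypothetical stand-in for the PostgreSQL array column
        CREATE TABLE reference_exclusions (
            taxon_concept_id INTEGER, reference_id INTEGER,
            excluded_taxon_concept_id INTEGER
        );
        INSERT INTO taxon_concepts (id, parent_id, full_name) VALUES
            (10, NULL, 'Canidae'),
            (11, 10, 'Canis'),
            (12, 10, 'Cerdocyon'),
            (13, 11, 'Canis lupus'),
            (14, 12, 'Cerdocyon thous');
        INSERT INTO taxon_concept_references VALUES (10, 369);
        INSERT INTO reference_exclusions VALUES (10, 369, 12);
    """)
    return conn


def cascade_reference(conn, root_id, reference_id):
    """Flag the subtree below root_id, skipping excluded branches."""
    conn.execute(CASCADE_SQL, {"root_id": root_id, "reference_id": reference_id})
    flagged = conn.execute(
        "SELECT id FROM taxon_concepts WHERE has_std_ref = 1 ORDER BY id"
    )
    return [row[0] for row in flagged]
```

Because the excluded taxon never enters the recursive result, its own descendants are not reached either: here ids 10, 11 and 13 are flagged, while 12 and 14 are left alone.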
![Page 17: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/17.jpg)
#3: Window functions
SELECT ROW_NUMBER() OVER (ORDER BY full_name), full_name
FROM taxon_concepts
WHERE parent_id = 335
ORDER BY full_name
| row_number | full_name |
|---|---|
| 1 | Canis |
| 2 | Cerdocyon |
| 3 | Chrysocyon |
| 4 | Cuon |
| 5 | Dusicyon |
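The numbering query can be run against a handful of rows. A minimal sketch, assuming `sqlite3` with SQLite 3.25 or newer (the first release with window functions) in place of PostgreSQL; the ids other than 335 and the helper names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Genera under a family with id 335, inserted out of alphabetical order."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE taxon_concepts (
            id INTEGER PRIMARY KEY, parent_id INTEGER, full_name TEXT
        );
        INSERT INTO taxon_concepts VALUES
            (335, NULL, 'Canidae'),
            (1, 335, 'Dusicyon'),
            (2, 335, 'Canis'),
            (3, 335, 'Cuon'),
            (4, 335, 'Chrysocyon'),
            (5, 335, 'Cerdocyon');
    """)
    return conn


def numbered_children(conn, parent_id):
    """Return (row number, name) for the direct children, sorted by name."""
    return conn.execute("""
        SELECT ROW_NUMBER() OVER (ORDER BY full_name) AS rn, full_name
        FROM taxon_concepts
        WHERE parent_id = ?
        ORDER BY full_name
    """, (parent_id,)).fetchall()
```

`numbered_children(make_demo_db(), 335)` returns the five genera numbered 1 to 5 in alphabetical order, whatever the insertion order was.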
![Page 18: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/18.jpg)
WITH RECURSIVE q(id, full_name, path) AS (
    SELECT id, full_name, ARRAY[1]
    FROM taxon_concepts h
    WHERE id = 335
    UNION
    SELECT hi.id, hi.full_name,
        q.path || (
            ROW_NUMBER() OVER (PARTITION BY parent_id ORDER BY hi.full_name)
        )::INT
    FROM taxon_concepts hi
    JOIN q ON hi.parent_id = q.id
)
SELECT path, full_name
FROM q
ORDER BY path
CTE + window function
![Page 19: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/19.jpg)
| path | full_name |
|---|---|
| {1} | Canidae |
| {1,1} | Canis |
| {1,1,1} | Canis adustus |
| {1,1,2} | Canis aureus |
| {1,1,3} | Canis familiaris |
| (...) | |
| {1,1,7} | Canis lupus |
| {1,1,7,1} | Canis lupus crassodon |
| {1,1,7,2} | Canis lupus dingo |
| {1,2} | Cerdocyon |
| {1,2,1} | Cerdocyon thous |
With CTE and Windowing, SQL is Turing Complete.
![Page 21: Saving the Elephant with Slonik](https://reader033.fdocuments.in/reader033/viewer/2022052823/55508a4fb4c905235b8b4d22/html5/thumbnails/21.jpg)
PostgreSQL
- SQL Antipatterns: Avoiding the Pitfalls of Database Programming, Bill Karwin
- PostgreSQL: Up and Running, Regina Obe, Leo Hsu
- High Performance SQL with PostgreSQL 8.4
Code & Demo
- https://github.com/unepwcmc/SAPI
- Checklist of CITES Species
Graphics
- Items freed into the public domain, Pearson Scott Foresman
Taxonomy
- Biodiversity Information Standards (TDWG)