Going beyond Django ORM limitations with Postgres

Post on 10-May-2015


Going beyond the Django ORM limitations with Postgres

@craigkerstiens

Postgres.app

PSA: Macs

Why Postgres

http://www.craigkerstiens.com/2012/04/30/why-postgres/
https://speakerdeck.com/craigkerstiens/postgres-demystified

“It's the emacs of databases”

Datatypes
Conditional Indexes
Transactional DDL
Foreign Data Wrappers
Concurrent Index Creation
Extensions
Common Table Expressions
Fast Column Addition
Listen/Notify
Table Inheritance
Per Transaction sync replication
Window functions
NoSQL inside SQL

Momentum

TLDR

Limitations?

Django attempts to support as many features as possible on all database backends. However, not all database backends are alike, and we’ve had to make design decisions on which features to support and which assumptions we can make safely.

Datatypes

 __________
< Postgres >
 ----------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Datatypes

smallint, bigint, integer
numeric, float, serial, money
char, varchar, text, bytea
timestamp, timestamptz, date, time, timetz, interval
boolean, enum
point, line, polygon, box, circle, path
inet, cidr, macaddr
tsvector, tsquery
array, XML, UUID

 ________
< SQLite >
 --------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Datatypes

null, real, text, blob, integer

Indexes

 __________
< Postgres >
 ----------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Indexes

Multiple Types

B-Tree, GIN, GiST, KNN, SP-GiST

CREATE INDEX CONCURRENTLY
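One practical detail behind CREATE INDEX CONCURRENTLY: Postgres refuses to run it inside a transaction block, so a driver that opens transactions implicitly has to be put into autocommit mode first. A rough sketch of that, assuming `conn` is a psycopg2-style connection; the function name and its parameters are invented for illustration and are not from the talk.

```python
def create_index_concurrently(conn, index_name, table, column, method="btree"):
    """Build an index without blocking writes to the table.

    `conn` is assumed to be a psycopg2-style connection that is not
    currently inside a transaction. CREATE INDEX CONCURRENTLY cannot run
    inside a transaction block, so autocommit is switched on while the
    statement executes and restored afterwards.
    """
    allowed = {"btree", "gin", "gist", "spgist"}
    if method not in allowed:
        raise ValueError("unsupported index method: %r" % method)
    # Identifiers are quoted naively here; real code should use a proper
    # identifier-quoting helper rather than string formatting.
    sql = 'CREATE INDEX CONCURRENTLY "{0}" ON "{1}" USING {2} ("{3}")'.format(
        index_name, table, method, column
    )
    previous = conn.autocommit
    conn.autocommit = True
    try:
        cur = conn.cursor()
        cur.execute(sql)
        cur.close()
    finally:
        conn.autocommit = previous
    return sql
```

Called as `create_index_concurrently(conn, "item_tags_gin", "item", "tags", method="gin")`, this would issue a single statement and hand the connection back in its earlier mode.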

 _______
< MySQL >
 -------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Indexes

They exist

Digging In

Arrays

    CREATE TABLE item (
        id serial NOT NULL,
        name varchar(255),
        tags varchar(255)[],
        created_at timestamp
    );

Arrays

    INSERT INTO item
    VALUES (1, 'Django Pony', '{"Programming","Animal"}', now());

    INSERT INTO item
    VALUES (2, 'Ruby Gem', '{"Programming","Jewelry"}', now());

Django

    pip install djorm-ext-pgarray
    pip install djorm-ext-expressions

models.py:

    from django.db import models
    from djorm_pgarray.fields import ArrayField
    from djorm_expressions.models import ExpressionManager

    class Item(models.Model):
        name = models.CharField(max_length=255)
        tags = ArrayField(dbtype="varchar(255)")
        created_at = models.DateTimeField(auto_now=True)

        # Needed for the .where() queries used with SqlExpression.
        objects = ExpressionManager()

Django

    from djorm_expressions.base import SqlExpression

    i = Item(name='Django Pony', tags=['Programming', 'Animal'])
    i.save()

    # @> compares elements exactly, so the case must match the stored tag.
    qs = Item.objects.where(
        SqlExpression("tags", "@>", ['Programming'])
    )
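The `@>` operator in the query above means "contains": the row matches when every element on the right-hand side appears in the stored array, compared exactly. A small pure-Python imitation of that rule for one-dimensional arrays (the helper name `array_contains` is made up here, and NULL handling is ignored):

```python
def array_contains(left, right):
    """Imitate Postgres `left @> right` for one-dimensional arrays.

    True when every element of `right` also appears in `left`.
    Duplicates do not matter, and comparison is exact, so
    'Programming' and 'programming' are different elements.
    """
    return set(right).issubset(set(left))


tags = ["Programming", "Animal"]
print(array_contains(tags, ["Programming"]))            # True
print(array_contains(tags, ["programming"]))            # False
print(array_contains(tags, ["Animal", "Programming"]))  # True
```

The second call is the case worth remembering: a lowercase search term does not match a capitalized tag.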

Extensions

dblink
hstore
citext
ltree
isn
cube
pgcrypto
tablefunc
uuid-ossp
earthdistance
trigram
fuzzystrmatch
pgrowlocks
pgstattuple
btree_gist
dict_int
dict_xsyn
unaccent

NoSQL in your SQL

    CREATE EXTENSION hstore;

    CREATE TABLE item (
        id integer NOT NULL,
        name varchar(255),
        data hstore
    );

NoSQL in your SQL

    INSERT INTO item
    VALUES (1, 'Pony', 'rating => "4.0", color => "Pink"');
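The string handed to the hstore column above is just a comma-separated list of `key => value` pairs, with double quotes around keys and values and a backslash in front of any embedded quote or backslash. As an illustration of that text format only, here is a small sketch that renders a Python dict the same way; `to_hstore_literal` is a hypothetical helper, and in application code a driver or field adapter would normally do this for you.

```python
def to_hstore_literal(data):
    """Render a dict as hstore text, e.g. "color"=>"Pink", "rating"=>"4.0".

    Keys are emitted in sorted order so the output is predictable.
    A value of None becomes an unquoted NULL.
    """
    def quote(text):
        # Escape backslashes first, then double quotes.
        escaped = str(text).replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    pairs = []
    for key in sorted(data):
        value = data[key]
        rendered = "NULL" if value is None else quote(value)
        pairs.append(quote(key) + "=>" + rendered)
    return ", ".join(pairs)


print(to_hstore_literal({"rating": "4.0", "color": "Pink"}))
# "color"=>"Pink", "rating"=>"4.0"
```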

Django

    pip install django-hstore

    from django.db import models
    from django_hstore import hstore

    class Item(models.Model):
        name = models.CharField(max_length=250)
        data = hstore.DictionaryField(db_index=True)
        objects = hstore.Manager()

        def __unicode__(self):
            return self.name

Django

    Item.objects.create(
        name='Django Pony',
        data={'rating': '5'}
    )

    Item.objects.create(
        name='Pony',
        data={'color': 'pink', 'rating': '4'}
    )

Django

    colored_ponies = Item.objects.filter(data__contains='color')
    print(colored_ponies[0].data['color'])

    favorite_ponies = Item.objects.filter(data__contains={'rating': '5'})
    print(favorite_ponies[0])

JSON

w/ PLV8

Queueing

Pub/Sub?

Postgres: a great queue

Not with polling

Trunk

    pip install celery trunk

    psql < sql/*.sql
    celery worker -A tasks --loglevel=info

    ipython -i tasks.py
    >>> add.delay(2, 2)

Trunk

    from celery import Celery

    celery = Celery('tasks')
    celery.config_from_object({
        'BROKER_URL': 'trunk.transport.Transport://localhost/trunk',
    })

    @celery.task
    def add(x, y):
        return x + y
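The slides do not show how Trunk works internally. Independently of it, the usual way to avoid polling a queue table in Postgres is LISTEN/NOTIFY: the consumer sleeps on the connection's socket and wakes only when the server pushes a notification. A minimal sketch of the consumer side, assuming `conn` is a psycopg2-style connection in autocommit mode on which `LISTEN` has already been executed; the function names are invented for illustration.

```python
import select


def drain_notifications(conn):
    """Collect pending notifications from a psycopg2-style connection."""
    conn.poll()  # read whatever the server has sent so far
    received = []
    while conn.notifies:
        note = conn.notifies.pop(0)
        received.append((note.channel, note.payload))
    return received


def wait_for_notifications(conn, timeout=5.0):
    """Sleep until the connection's socket is readable, then drain it.

    Returns an empty list when `timeout` seconds pass with no activity,
    so the caller never has to query the queue table in a loop.
    """
    readable, _, _ = select.select([conn], [], [], timeout)
    if not readable:
        return []
    return drain_notifications(conn)
```

A producer would wake this consumer with `NOTIFY jobs, 'some payload'` or `SELECT pg_notify('jobs', 'some payload')`, where `jobs` is a hypothetical channel name.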

Text Search

Searching Text

Lucene
Sphinx
Elastic Search
Solr
Postgres

Full Text Search

    CREATE TABLE posts (
        id serial,
        title varchar(255),
        content text,
        tags varchar(255)[],
        post_text tsvector
    );

    CREATE INDEX posttext_gin ON posts USING GIN(post_text);

    -- First argument is the tsvector column to fill; the remaining columns
    -- must be text types, so the tags array is left out.
    CREATE TRIGGER update_posttext
    BEFORE INSERT OR UPDATE ON posts
    FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(post_text, 'pg_catalog.english', title, content);

Django

    from djorm_pgfulltext.models import SearchManager
    from djorm_pgfulltext.fields import VectorField
    from django.db import models

    class Post(models.Model):
        title = models.CharField(max_length=200)
        content = models.TextField()

        search_index = VectorField()

        objects = SearchManager(
            fields=('title', 'content'),
            config='pg_catalog.english',
            search_field='search_index',
            auto_update_search_field=True
        )

Django

    Post.objects.search("documentation & about")

    Post.objects.search("about | documentation")
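The search strings above use the tsquery operators `&` (and) and `|` (or). If the words come from user input and the string ends up parsed as a tsquery, stray punctuation can produce an invalid query, so it can help to build the string from cleaned terms. A possible sketch; `build_tsquery` is a hypothetical helper and is not part of djorm-ext-pgfulltext.

```python
import re


def build_tsquery(terms, operator="&"):
    """Join search terms into a tsquery-style string.

    Non-word characters are stripped from each term and empty terms are
    dropped, so punctuation in user input cannot act as an operator.
    """
    if operator not in ("&", "|"):
        raise ValueError("operator must be '&' or '|'")
    cleaned = []
    for term in terms:
        word = re.sub(r"\W+", "", term)
        if word:
            cleaned.append(word)
    return (" %s " % operator).join(cleaned)


print(build_tsquery(["documentation", "about"]))       # documentation & about
print(build_tsquery(["about", "documentation"], "|"))  # about | documentation
```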

Indexes

B-Tree
Generalized Inverted Index (GIN)
Generalized Search Tree (GiST)
K Nearest Neighbors (KNN)
Space Partitioned GiST (SP-GiST)

Which do I use?

Generalized Inverted Index (GIN)
Use with multiple values in 1 column
Array/hstore

Generalized Search Tree (GiST)
Full text search
Shapes
PostGIS
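Read as a lookup, the guidance on these slides might be sketched like this. The dictionary keys and the function name `suggest_index` are made up for illustration; falling back to B-Tree reflects only that it is Postgres' default index method, not a recommendation from the talk.

```python
# Use case -> index method, following the GIN and GiST slides above.
INDEX_FOR_USE_CASE = {
    "array": "gin",
    "hstore": "gin",
    "full text search": "gist",
    "shapes": "gist",
    "postgis": "gist",
}


def suggest_index(use_case, default="btree"):
    """Return the index method the slides associate with a use case."""
    return INDEX_FOR_USE_CASE.get(use_case.strip().lower(), default)


print(suggest_index("Array"))    # gin
print(suggest_index("PostGIS"))  # gist
print(suggest_index("integer"))  # btree
```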

GeoSpatial

One More

Connections

django-postgrespool
djorm-ext-pool
django-db-pool

django-postgrespool

    import dj_database_url
    import django_postgrespool

    DATABASES = {'default': dj_database_url.config()}
    DATABASES['default']['ENGINE'] = 'django_postgrespool'

    SOUTH_DATABASE_ADAPTERS = {
        'default': 'south.db.postgresql_psycopg2'
    }
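For readers new to pooling, the idea behind these packages is to open a fixed number of connections once and lend them out, instead of reconnecting on every request. The sketch below shows only that idea with the standard library; it is not how django-postgrespool or the other packages are implemented, and `SimplePool` and `connect` are hypothetical names.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Keep `size` connections open and lend them out one at a time."""

    def __init__(self, connect, size=2):
        # `connect` is any zero-argument callable that returns a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is checked out; raises queue.Empty
        # if `timeout` seconds pass without one being returned.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse
```

With a pool of size 2, any number of sequential `with pool.connection() as conn:` blocks reuse the same two connections, and a third simultaneous borrower waits until one is returned.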

Limitations?

Django attempts to support as many features as possible on all database backends. However, not all database backends are alike, and we’ve had to make design decisions on which features to support and which assumptions we can make safely.

Postgres

It's great

Django ORM

It's not so bad

Thanks!