Relational Database Access with Python ‘sans’ ORM

Slides from my PyCon APAC 2012 talk in Singapore

Page 1: Relational Database Access with Python ‘sans’ ORM

Relational Database Access with Python ‘sans’ ORM

Mark Rees
CTO, Century Software (M) Sdn. Bhd.

Page 2: Relational Database Access with Python ‘sans’ ORM

Your Current Relational Database Access Style?

# Django ORM
>>> from ip2country.models import Ip2Country
>>> Ip2Country.objects.all()
[<Ip2Country: Ip2Country object>, <Ip2Country: Ip2Country object>, '...(remaining elements truncated)...']
>>> sgp = Ip2Country.objects.filter(assigned__year=2012)\
...     .filter(countrycode2='SG')
>>> sgp[0].ipfrom
1729580032.0

Page 3: Relational Database Access with Python ‘sans’ ORM

Your Current Relational Database Access Style?

# SQLAlchemy ORM
>>> from sqlalchemy import create_engine, extract
>>> from sqlalchemy.orm import sessionmaker
>>> from models import Ip2Country
>>> engine = create_engine('postgresql://ip2country_rw:secret@localhost/ip2country')
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> all_data = session.query(Ip2Country).all()
>>> sgp = session.query(Ip2Country).\
...     filter(extract('year', Ip2Country.assigned) == 2012).\
...     filter(Ip2Country.countrycode2 == 'SG')
>>> print sgp[0].ipfrom
1729580032.0

Page 4: Relational Database Access with Python ‘sans’ ORM

SQL Relational Database Access

SELECT * FROM ip2country;

"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
1729522688;1729523711;"apnic";"2011-08-05";"CN";"CHN";"China"
1729523712;1729524735;"apnic";"2011-08-05";"CN";"CHN";"China"
. . .

SELECT * FROM ip2country
WHERE date_part('year', assigned) = 2012
AND countrycode2 = 'SG';

"ipfrom";"ipto";"registry";"assigned";"countrycode2";"countrycode3";"countryname"
1729580032;1729581055;"apnic";"2012-01-16";"SG";"SGP";"Singapore"
1729941504;1729942527;"apnic";"2012-01-10";"SG";"SGP";"Singapore"
. . .

SELECT ipfrom FROM ip2country
WHERE date_part('year', assigned) = 2012
AND countrycode2 = 'SG';

"ipfrom"
1729580032
1729941504
. . .

Page 5: Relational Database Access with Python ‘sans’ ORM

Python + SQL == Python DB-API 2.0

• The Python standard for a consistent interface to relational databases is the Python DB-API (PEP 249)

• The majority of Python database interfaces adhere to this standard

Page 6: Relational Database Access with Python ‘sans’ ORM

Python DB-API UML Diagram

Page 7: Relational Database Access with Python ‘sans’ ORM

Python DB-API Connection Object

Access the database via the connection object

• Use connect constructor to create a connection with database
    conn = psycopg2.connect(parameters…)

• Create cursor via the connection
    cur = conn.cursor()

• Transaction management (implicit begin)
    conn.commit()
    conn.rollback()

• Close connection (will rollback current transaction)
    conn.close()

• Check module capabilities by globals
    psycopg2.apilevel
    psycopg2.threadsafety
    psycopg2.paramstyle
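The connection life cycle above can be tried without a database server. The following is a minimal sketch in Python 3 that assumes the standard library's sqlite3 module as a stand-in for psycopg2; the in-memory database and the two-column table are illustrative choices, not part of the talk.

```python
import sqlite3

# Module-level globals required by PEP 249.
print(sqlite3.apilevel)      # DB-API level the module implements
print(sqlite3.paramstyle)    # placeholder style the module expects
print(sqlite3.threadsafety)  # integer 0-3; the value depends on the build

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
conn.commit()

# A transaction begins implicitly with the INSERT; rollback discards it.
cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))
conn.rollback()
cur.execute("SELECT count(*) FROM ip2country")
rows_after_rollback = cur.fetchone()[0]

# The same INSERT followed by commit is kept.
cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))
conn.commit()
cur.execute("SELECT count(*) FROM ip2country")
rows_after_commit = cur.fetchone()[0]

print(rows_after_rollback, rows_after_commit)

cur.close()
conn.close()
```

Only the connect() arguments and the placeholder style are driver specific; the commit, rollback and close calls are the ones PEP 249 defines.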

Page 8: Relational Database Access with Python ‘sans’ ORM

Python DB-API Cursor Object

A cursor object is used to represent a database cursor, which is used to manage the context of fetch operations.

• Cursors created from the same connection are not isolated
    cur = conn.cursor()
    cur2 = conn.cursor()

• Cursor methods
    cur.execute(operation, parameters)
    cur.executemany(op, seq_of_parameters)
    cur.fetchone()
    cur.fetchmany([size=cursor.arraysize])
    cur.fetchall()
    cur.close()
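A short sketch of the cursor methods listed above, again assuming Python 3 and sqlite3 with an in-memory database in place of psycopg2. The sample rows are taken from the SQL output shown earlier; the reduced three-column table is an assumption made to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur2 = conn.cursor()  # second cursor on the same connection

cur.execute(
    "CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)"
)
sample = [
    (1729522688, 1729523711, "CN"),
    (1729523712, 1729524735, "CN"),
    (1729580032, 1729581055, "SG"),
    (1729941504, 1729942527, "SG"),
]
# executemany runs the same statement once per parameter tuple.
cur.executemany("INSERT INTO ip2country VALUES (?, ?, ?)", sample)

# Nothing is committed yet, but cur2 shares the connection and
# therefore already sees the rows inserted through cur.
cur2.execute("SELECT count(*) FROM ip2country")
seen_by_cur2 = cur2.fetchone()[0]

cur.execute("SELECT ipfrom FROM ip2country ORDER BY ipfrom")
first = cur.fetchone()      # one row as a tuple
next_two = cur.fetchmany(2)  # list with up to two rows
rest = cur.fetchall()       # all remaining rows
exhausted = cur.fetchone()  # None once the result set is used up

cur.close()
cur2.close()
conn.close()
```

The count read through cur2 illustrates what "not isolated" means here: isolation applies between connections, not between cursors of one connection.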

Page 9: Relational Database Access with Python ‘sans’ ORM

Python DB-API Cursor Object

• Optional cursor methods
    cur.scroll(value[, mode='relative'])
    cur.next()
    cur.callproc(procname[, parameters])
    cur.__iter__()

• Results of an operation
    cur.description
    cur.rowcount
    cur.lastrowid

• DB adaptor specific “proprietary” cursor methods
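The result attributes and the optional iterator protocol can be sketched as follows. This assumes Python 3 and sqlite3, which happens to support both lastrowid and cursor iteration; other adaptors may not provide every optional feature.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")

cur.execute("INSERT INTO ip2country VALUES (?, ?)", (1729580032, "SG"))
inserted = cur.rowcount    # number of rows affected by the INSERT
new_rowid = cur.lastrowid  # row id of the row just inserted

cur.execute("SELECT ipfrom, countrycode2 FROM ip2country")
# description holds one 7-item sequence per column; item 0 is the name.
column_names = [col[0] for col in cur.description]

# Optional extension: iterate over the cursor instead of calling fetch*().
rows = [row for row in cur]

cur.close()
conn.close()
```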

Page 10: Relational Database Access with Python ‘sans’ ORM

Python DB-API Parameter Styles

Allows you to keep SQL separate from parameters

Improves performance & security

Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.

From http://initd.org/psycopg/docs/usage.html#query-parameters
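To make the warning concrete, here is a small sketch of what goes wrong with string interpolation. It assumes Python 3 and sqlite3 with its qmark placeholders; the hostile input value is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?)",
    [(1729522688, "CN"), (1729580032, "SG")],
)

# Hypothetical user input that smuggles SQL into the value.
hostile = "SG' OR '1'='1"

# UNSAFE: the value becomes part of the SQL text, so the WHERE clause
# turns into  countrycode2 = 'SG' OR '1'='1'  and matches every row.
unsafe_sql = "SELECT ipfrom FROM ip2country WHERE countrycode2 = '%s'" % hostile
cur.execute(unsafe_sql)
unsafe_rows = cur.fetchall()

# SAFE: the value is passed separately and only ever treated as data.
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", (hostile,))
safe_rows = cur.fetchall()

print(len(unsafe_rows), len(safe_rows))

cur.close()
conn.close()
```

The interpolated query returns the whole table, while the parameterized query returns nothing because no country code equals the hostile string.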

Page 11: Relational Database Access with Python ‘sans’ ORM

Python DB-API Parameter Styles

Global paramstyle gives supported style for the adaptor

qmark     Question mark style
    WHERE countrycode2 = ?

numeric   Numeric positional style
    WHERE countrycode2 = :1

named     Named style
    WHERE countrycode2 = :code

format    ANSI C printf format style
    WHERE countrycode2 = %s

pyformat  Python format style
    WHERE countrycode2 = %(name)s
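As a sketch of how the styles differ in practice, the example below runs the same query with the qmark and named styles. It assumes Python 3 and sqlite3, which accepts both; psycopg2 reports pyformat and would take %s or %(code)s in the same positions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ip2country (ipfrom INTEGER, countrycode2 TEXT)")
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?)",
    [(1729522688, "CN"), (1729580032, "SG")],
)

# The module advertises its preferred style.
style = sqlite3.paramstyle

# qmark: positional placeholders, parameters given as a sequence.
cur.execute("SELECT ipfrom FROM ip2country WHERE countrycode2 = ?", ("SG",))
qmark_rows = cur.fetchall()

# named: named placeholders, parameters given as a mapping.
cur.execute(
    "SELECT ipfrom FROM ip2country WHERE countrycode2 = :code",
    {"code": "SG"},
)
named_rows = cur.fetchall()

cur.close()
conn.close()
```

Because the placeholder syntax is adaptor specific, SQL strings usually need adjusting when a program moves from one DB-API module to another.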

Page 12: Relational Database Access with Python ‘sans’ ORM

Python + SQL: INSERT

import csv, datetime, psycopg2

conn = psycopg2.connect("dbname=ip2country user=ip2country_rw password=secret")
cur = conn.cursor()
with open("IpToCountry.csv", "rb") as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print row
            if row[0][0] != "#":
                row[3] = datetime.datetime.utcfromtimestamp(float(row[3]))
                cur.execute("""INSERT INTO ip2country(
                    ipfrom, ipto, registry, assigned,
                    countrycode2, countrycode3, countryname)
                    VALUES (%s, %s, %s, %s, %s, %s, %s)""", row)
    except:
        conn.rollback()
    else:
        conn.commit()
    finally:
        cur.close()
        conn.close()
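For readers on Python 3, the same load can be sketched with executemany and a generator. This is an illustration under stated assumptions, not the talk's code: sqlite3 replaces psycopg2, an in-memory string replaces IpToCountry.csv, the sample lines only mimic the column order used above, and load_rows is a hypothetical helper.

```python
import csv
import datetime
import io
import sqlite3

# Assumption: two data lines in the column order of the INSERT above,
# with the assignment date as a Unix timestamp and '#' comment lines.
SAMPLE_CSV = '''# comment line
"1729522688","1729523711","apnic","1312502400","CN","CHN","China"
"1729523712","1729524735","apnic","1312502400","CN","CHN","China"
'''


def load_rows(lines):
    """Yield one parameter list per data line, skipping comments."""
    for row in csv.reader(lines):
        if not row or row[0].startswith("#"):
            continue
        row[0] = int(row[0])
        row[1] = int(row[1])
        assigned = datetime.datetime.fromtimestamp(
            float(row[3]), datetime.timezone.utc
        ).date()
        row[3] = assigned.isoformat()  # stored as ISO text in SQLite
        yield row


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    """CREATE TABLE ip2country (
        ipfrom INTEGER, ipto INTEGER, registry TEXT, assigned TEXT,
        countrycode2 TEXT, countrycode3 TEXT, countryname TEXT)"""
)
try:
    cur.executemany(
        """INSERT INTO ip2country (
            ipfrom, ipto, registry, assigned,
            countrycode2, countrycode3, countryname)
            VALUES (?, ?, ?, ?, ?, ?, ?)""",
        load_rows(io.StringIO(SAMPLE_CSV)),
    )
except Exception:
    conn.rollback()  # undo the partial load, then report the error
    raise
else:
    conn.commit()

cur.execute("SELECT ipfrom, assigned, countryname FROM ip2country ORDER BY ipfrom")
loaded = cur.fetchall()
cur.close()
conn.close()
```

Re-raising after the rollback is a deliberate difference from the slide: it keeps a failed load from passing silently.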

Page 13: Relational Database Access with Python ‘sans’ ORM

Python + SQL: SELECT

# Find ipv4 address ranges assigned to Singapore
import psycopg2, socket, struct

def num_to_dotted_quad(n):
    """convert long int to dotted quad string
    http://code.activestate.com/recipes/66517/"""
    return socket.inet_ntoa(struct.pack('!L', n))

conn = psycopg2.connect("dbname=ip2country user=ip2country_rw password=secret")

cur = conn.cursor()

cur.execute("""SELECT * FROM ip2country
    WHERE countrycode2 = 'SG'
    ORDER BY ipfrom""")

for row in cur:
    print "%s - %s" % (num_to_dotted_quad(int(row[0])),
                       num_to_dotted_quad(int(row[1])))
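The query above hard-codes the country code. A variant that passes it as a parameter might look like the sketch below, which assumes Python 3 and sqlite3 with an in-memory table; the Singapore row reuses the ipfrom and ipto values from the SQLPython output shown later in the deck.

```python
import socket
import sqlite3
import struct


def num_to_dotted_quad(n):
    """Convert a 32-bit integer to a dotted quad string."""
    return socket.inet_ntoa(struct.pack("!L", n))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE ip2country (ipfrom INTEGER, ipto INTEGER, countrycode2 TEXT)"
)
cur.executemany(
    "INSERT INTO ip2country VALUES (?, ?, ?)",
    [(1728830464, 1728830719, "SG"), (1729522688, 1729523711, "CN")],
)

# The country code travels as a parameter, not as part of the SQL text.
cur.execute(
    "SELECT ipfrom, ipto FROM ip2country WHERE countrycode2 = ? ORDER BY ipfrom",
    ("SG",),
)
ranges = [
    "%s - %s" % (num_to_dotted_quad(ipfrom), num_to_dotted_quad(ipto))
    for ipfrom, ipto in cur.fetchall()
]
for line in ranges:
    print(line)

cur.close()
conn.close()
```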

Page 14: Relational Database Access with Python ‘sans’ ORM

SQLite

• sqlite3
  • CPython 2.5 & 3
  • DB-API 2.0
  • Part of CPython distribution since 2.5

Page 15: Relational Database Access with Python ‘sans’ ORM

PostgreSQL

• psycopg
  • CPython 2 & 3
  • DB-API 2.0, level 2 thread safe
  • Appears to be most popular
  • http://initd.org/psycopg/

• py-postgresql
  • CPython 3
  • DB-API 2.0
  • Written in Python with optional C optimizations
  • pg_python - console
  • http://python.projects.postgresql.org/

Page 16: Relational Database Access with Python ‘sans’ ORM

PostgreSQL

• PyGreSQL
  • CPython 2.3+
  • Classic & DB-API 2.0 interfaces
  • http://www.pygresql.org/
  • Last release 2009

• pyPgSQL
  • CPython 2
  • Classic & DB-API 2.0 interfaces
  • http://www.pygresql.org/
  • Last release 2006

Page 17: Relational Database Access with Python ‘sans’ ORM

PostgreSQL

• pypq
  • CPython 2.7 & pypy 1.7+
  • Uses ctypes
  • DB-API 2.0 interface
  • psycopg2-like extension API
  • https://bitbucket.org/descent/pypq

• psycopg2ct
  • CPython 2.6+ & pypy 1.6+
  • Uses ctypes
  • DB-API 2.0 interface
  • psycopg2 compat layer
  • http://github.com/mvantellingen/psycopg2-ctypes

Page 18: Relational Database Access with Python ‘sans’ ORM

MySQL

• MySQL-python
  • CPython 2.3+
  • DB-API 2.0 interface
  • http://sourceforge.net/projects/mysql-python/

• PyMySQL
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • http://www.pymysql.org/

• MySQL-Connector
  • CPython 2.4+ & 3
  • Pure Python DB-API 2.0 interface
  • https://launchpad.net/myconnpy

Page 19: Relational Database Access with Python ‘sans’ ORM

Other “Enterprise” Databases

• cx_Oracle
  • CPython 2 & 3
  • DB-API 2.0 interface
  • http://cx-oracle.sourceforge.net/

• informixdb
  • CPython 2
  • DB-API 2.0 interface
  • http://informixdb.sourceforge.net/
  • Last release 2007

• ibm-db
  • CPython 2
  • DB-API 2.0 for DB2 & Informix
  • http://code.google.com/p/ibm-db/

Page 20: Relational Database Access with Python ‘sans’ ORM

ODBC

• mxODBC
  • CPython 2.3+
  • DB-API 2.0 interfaces
  • http://www.egenix.com/products/python/mxODBC/doc
  • Commercial product

• PyODBC
  • CPython 2 & 3
  • DB-API 2.0 interfaces with extensions
  • http://code.google.com/p/pyodbc/

• ODBC interfaces not limited to Windows thanks to iODBC and unixODBC

Page 21: Relational Database Access with Python ‘sans’ ORM

Jython + SQL

• zxJDBC
  • DB-API 2.0
  • Written in Java using JDBC API so can utilize JDBC drivers
  • Support for connection pools and JNDI lookup
  • Included with standard Jython installation http://www.jython.org/

• jyjdbc
  • DB-API 2.0 compliant
  • Written in Python/Jython so can utilize JDBC drivers
  • Decimal data type support
  • http://code.google.com/p/jyjdbc/

Page 22: Relational Database Access with Python ‘sans’ ORM

IronPython + SQL

• adodbapi
  • IronPython 2+
  • Also works with CPython 2.3+ with pywin32
  • http://adodbapi.sourceforge.net/

Page 23: Relational Database Access with Python ‘sans’ ORM

Gerald, the half a schema

import gerald
s1 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2country')
s2 = gerald.PostgresSchema('public', 'postgres://ip2country_rw:secret@localhost/ip2countryv4')

print s1.schema['ip2country'].compare(s2.schema['ip2country'])
DIFF: Definition of assigned is different
DIFF: Column countryname not in ip2country
DIFF: Definition of registry is different
DIFF: Column countrycode3 not in ip2country
DIFF: Definition of countrycode2 is different

• Database schema toolkit
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://halfcooked.com/code/gerald/

Page 24: Relational Database Access with Python ‘sans’ ORM

SQLPython

$ sqlpython --postgresql ip2country ip2country_rw
Password:
0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG';
...
1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore

551 rows selected.

0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG'\j
[...
{"ipfrom": 1728830464.0, "ipto": 1728830719.0, "registry": "apnic", "assigned": "2011-11-02", "countrycode2": "SG", "countrycode3": "SGP", "countryname": "Singapore"}]

• A command-line interface to relational databases
• via DB-API currently supports
  • PostgreSQL
  • MySQL
  • Oracle
• http://packages.python.org/sqlpython/

Page 25: Relational Database Access with Python ‘sans’ ORM

SQLPython, batteries included

0:ip2country_rw@ip2country> select * from ip2country where countrycode2='SG';
...
1728830464.0 1728830719.0 apnic 2011-11-02 SG SGP Singapore

551 rows selected.

0:ip2country_rw@ip2country> py
Python 2.6.6 (r266:84292, May 20 2011, 16:42:25)
[GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2

py <command>: Executes a Python command.
py: Enters interactive Python mode.
End with `Ctrl-D` (Unix) / `Ctrl-Z` (Windows), `quit()`, `exit()`.
Past SELECT results are exposed as list `r`;
most recent resultset is `r[-1]`.
SQL bind, substitution variables are exposed as `binds`, `substs`.
Run python code from external files with ``run("filename.py")``
>>> r[-1][-1]
(1728830464.0, 1728830719.0, 'apnic', datetime.date(2011, 11, 2), 'SG', 'SGP', 'Singapore')
>>> import socket, struct
>>> def num_to_dotted_quad(n):
...     return socket.inet_ntoa(struct.pack('!L', n))
...
>>> num_to_dotted_quad(int(r[-1][-1].ipfrom))
'103.11.220.0'

Page 26: Relational Database Access with Python ‘sans’ ORM

SpringPython – Database Templates

# Find ipv4 address ranges assigned to Singapore
# using SpringPython DatabaseTemplate & DictionaryRowMapper

from springpython.database.core import *
from springpython.database.factory import *

conn_factory = PgdbConnectionFactory(
    user="ip2country_rw",
    password="secret",
    host="localhost",
    database="ip2country")
dt = DatabaseTemplate(conn_factory)

results = dt.query(
    "SELECT * FROM ip2country WHERE countrycode2=%s",
    ("SG",),
    DictionaryRowMapper())

for row in results:
    print "%s - %s" % (num_to_dotted_quad(int(row['ipfrom'])),
                       num_to_dotted_quad(int(row['ipto'])))

Page 27: Relational Database Access with Python ‘sans’ ORM

Attributions

DB-API 2.0 PEP
http://www.python.org/dev/peps/pep-0249/

Travis Spencer’s DB-API UML Diagram
http://travisspencer.com/

Andrew Kuchling's introduction to the DB-API
http://www.amk.ca/python/writing/DB-API.html

Page 28: Relational Database Access with Python ‘sans’ ORM

Attributions

Andy Todd’s OSDC paper
http://halfcooked.com/presentations/osdc2006/python_databases.html

Source of csv data used in examples from WebNet77, licensed under GPLv3
http://software77.net/geo-ip/

Page 29: Relational Database Access with Python ‘sans’ ORM

Contact Details

Mark Rees
mark at centurysoftware dot com dot my
+Mark Rees
@hexdump42
hex-dump.blogspot.com