Transcript
Page 1: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Pat Patterson

Principal Developer Evangelist

[email protected]

@metadaddy

Page 2: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Agenda

Foreign Data Wrappers

Writing FDWs in C

Multicorn

Database.com FDW for PostgreSQL

FDW in action

Page 3: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Why Foreign Data Wrappers?

External data sources look like local tables!

– Other SQL database

• MySQL, Oracle, SQL Server, etc

– NoSQL database

• CouchDB, Redis, etc

– File

– LDAP

– Web services

• Twitter!

Page 4: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Why Foreign Data Wrappers?

Make the database do the work

– SELECT syntax

• DISTINCT, ORDER BY etc

– Functions

• COUNT(), MIN(), MAX() etc

– JOIN external data to internal tables

– Use standard apps, libraries for data analysis, reporting

Page 5: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Foreign Data Wrappers

2003 – SQL Management of External Data (SQL/MED)

2011 – PostgreSQL 9.1 implementation

– Read-only

– SELECT-clause optimization

– WHERE-clause push-down

• Minimize data requested from external source

Future Improvements

– JOIN push-down

• Where two foreign tables are in the same server

– Support cursors
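WHERE-clause push-down can be sketched in a few lines of Python. The sketch below is illustrative only: `Qual` is a hypothetical stand-in (Multicorn's qualifier objects expose the same `field_name`, `operator` and `value` attributes), `build_remote_query` and `quote_literal` are made-up helper names, and the quoting is simplified SQL-style quoting rather than any particular remote dialect. It forwards only the conditions the remote source is assumed to understand and leaves the rest for PostgreSQL to evaluate locally.

```python
from collections import namedtuple

# Hypothetical stand-in for the qualifier objects an FDW framework passes in.
Qual = namedtuple("Qual", ["field_name", "operator", "value"])

# Operators this sketch assumes the remote source can evaluate itself.
PUSHABLE_OPERATORS = {"=", "<", ">", "<=", ">=", "<>"}


def quote_literal(value):
    """Render a Python value as a literal for the remote query (simplified)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_remote_query(table, columns, quals):
    """Build a remote SELECT that asks only for the needed columns and rows."""
    conditions = [
        "%s %s %s" % (q.field_name, q.operator, quote_literal(q.value))
        for q in quals
        if q.operator in PUSHABLE_OPERATORS  # anything else stays local
    ]
    query = "SELECT %s FROM %s" % (", ".join(columns), table)
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


quals = [
    Qual("lastname", "=", "O'Brien"),
    Qual("age", ">", 30),
    Qual("firstname", "~~", "P%"),  # LIKE: not pushed down in this sketch
]
remote_query = build_remote_query("Contact", ["firstname", "lastname"], quals)
print(remote_query)
```

The unsupported `~~` condition is simply dropped from the remote query, so the remote source returns a superset of the matching rows and the remaining filtering has to happen on the PostgreSQL side.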

Page 6: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

FDWs in PostgreSQL

‘Compiled language’ (C) interface

Implement a set of callbacks

typedef struct FdwRoutine
{
    NodeTag type;

    /* These functions are required. */
    GetForeignRelSize_function GetForeignRelSize;
    GetForeignPaths_function GetForeignPaths;
    GetForeignPlan_function GetForeignPlan;
    ExplainForeignScan_function ExplainForeignScan;
    BeginForeignScan_function BeginForeignScan;
    IterateForeignScan_function IterateForeignScan;
    ReScanForeignScan_function ReScanForeignScan;
    EndForeignScan_function EndForeignScan;

    /* These functions are optional. */
    AnalyzeForeignTable_function AnalyzeForeignTable;
} FdwRoutine;

Page 7: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

FDWs in PostgreSQL

Much work!

• CouchDB FDW

• https://github.com/ZhengYang/couchdb_fdw/

• couchdb_fdw.c > 1700 LoC

Page 8: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Multicorn

http://multicorn.org/

PostgreSQL 9.1+ extension

Python framework for FDWs

Implement two methods…

Page 9: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Multicorn

from multicorn import ForeignDataWrapper


class ConstantForeignDataWrapper(ForeignDataWrapper):

    def __init__(self, options, columns):
        super(ConstantForeignDataWrapper, self).__init__(options, columns)
        # Remember the foreign table's columns so execute() can fill them.
        self.columns = columns

    def execute(self, quals, columns):
        # Yield 20 rows; every column holds "<column name> <row index>".
        for index in range(20):
            line = {}
            for column_name in self.columns:
                line[column_name] = '%s %s' % (column_name, index)
            yield line
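To see what the wrapper produces without installing PostgreSQL or Multicorn, the same class can be run against a stand-in base class. This is only a sketch: the `ForeignDataWrapper` defined here is a hypothetical stub, not Multicorn's real class, and the column names `test` and `test2` are made up (inside PostgreSQL they would come from the foreign table definition).

```python
class ForeignDataWrapper(object):
    """Hypothetical stub replacing multicorn.ForeignDataWrapper, so the
    example runs in a plain Python interpreter."""

    def __init__(self, options, columns):
        self.options = options
        self.columns = columns


class ConstantForeignDataWrapper(ForeignDataWrapper):

    def __init__(self, options, columns):
        super(ConstantForeignDataWrapper, self).__init__(options, columns)
        self.columns = columns

    def execute(self, quals, columns):
        # One dict per row, keyed by column name.
        for index in range(20):
            line = {}
            for column_name in self.columns:
                line[column_name] = '%s %s' % (column_name, index)
            yield line


# Illustrative column names; a plain list is a simplification of what the
# framework would pass in.
wrapper = ConstantForeignDataWrapper({}, ["test", "test2"])
rows = list(wrapper.execute(quals=[], columns=["test", "test2"]))

print(len(rows))   # 20
print(rows[0])     # {'test': 'test 0', 'test2': 'test2 0'}
print(rows[-1])    # {'test': 'test 19', 'test2': 'test2 19'}
```

Each yielded dict becomes one row of the foreign table, which is why a generator is a natural fit: rows are handed over one at a time as the scan asks for them.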

Page 10: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Database.com FDW for PostgreSQL

OAuth login to Database.com / Force.com

– Refresh on token expiry

Force.com REST API

– SOQL query

• SELECT firstname, lastname FROM Contact

Request thread puts records in Queue, execute() method gets them from Queue

JSON parsing – skip embedded metadata

< 250 lines of code
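The request-thread and queue arrangement, together with skipping the embedded metadata, might look roughly like the sketch below. It is not the actual Database.com FDW code: the function names, the sentinel object and the two fake response pages are all assumptions made for illustration, and no HTTP or OAuth calls are made. The fake pages mimic the shape of Force.com REST query results, where each record carries an `attributes` object alongside its fields.

```python
import json
import queue
import threading

# Fake response pages (illustrative data only); each record embeds an
# "attributes" metadata object next to the requested fields.
FAKE_PAGES = [
    json.dumps({"done": False, "records": [
        {"attributes": {"type": "Contact", "url": "/fake/1"},
         "FirstName": "Ada", "LastName": "Lovelace"},
    ]}),
    json.dumps({"done": True, "records": [
        {"attributes": {"type": "Contact", "url": "/fake/2"},
         "FirstName": "Alan", "LastName": "Turing"},
    ]}),
]

END_OF_RESULTS = object()  # sentinel telling the consumer to stop


def fetch_records(pages, record_queue):
    """Producer: parse each page and queue its records, minus the metadata."""
    for page in pages:
        for record in json.loads(page)["records"]:
            row = {k: v for k, v in record.items() if k != "attributes"}
            record_queue.put(row)
    record_queue.put(END_OF_RESULTS)


def execute(pages):
    """Consumer: yield rows as soon as the request thread has queued them."""
    record_queue = queue.Queue()
    worker = threading.Thread(target=fetch_records, args=(pages, record_queue))
    worker.daemon = True
    worker.start()
    while True:
        row = record_queue.get()  # blocks until a row or the sentinel arrives
        if row is END_OF_RESULTS:
            break
        yield row
    worker.join()


rows = list(execute(FAKE_PAGES))
print(rows)
```

Decoupling fetching from yielding lets the first rows reach PostgreSQL while later pages are still being requested, instead of waiting for the whole result set.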

Page 11: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Demo

Page 12: Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers

Conclusion

Foreign Data Wrappers make the whole world look like tables!

Writing FDWs in C is hard!

– Or, at least, time consuming!

Writing FDWs in Python via Multicorn is easy!

– Or, at least, quick!

Try it for yourself!
