On Rabbits and Elephants

Given at pgConf.eu 2011 in Amsterdam

Transcript of On Rabbits and Elephants

Page 1: On Rabbits and Elephants

Gavin M. Roy @crad - pgConf.eu 2011

Data Distribution with pg_amqp and RabbitMQ

On Rabbits and Elephants

Page 2: On Rabbits and Elephants

myYearbook.com

• #1 Comscore Teen Destination, Top Trafficked Sites in US who is hiring

• Use of PostgreSQL predates my joining as CTO in 2007 and we’re hiring

• Use of RabbitMQ goes back to 2009, we were hiring then too

• Awesome place to work, have been known to sponsor Europeans for work in the US and we’re hiring

Page 3: On Rabbits and Elephants

Don't try this at home.

(try it at work or at least on your laptop)

Page 4: On Rabbits and Elephants

Audience Participation

• You’ll need PostgreSQL 9.1rc1 with pgbench installed

• Python 2.6 or Python 2.7

• Tarball from http://192.168.103.135/~gmr/

• Untar and then ./setup-client.py

Page 5: On Rabbits and Elephants

Use Cases

• Adding complexity to your environment

• Ensuring data inconsistency between canonical data sources and data warehouses

• Impressing your hipster Ruby programmer friends

• Replace your Slony replication setup

Page 6: On Rabbits and Elephants

Use Cases

• Write distributed map-reduce functions in Erlang

Page 7: On Rabbits and Elephants

What is PostgreSQL?

(just kidding)

Page 8: On Rabbits and Elephants

What is RabbitMQ?

(don’t let the cloud based marketing deter you)

Page 9: On Rabbits and Elephants

About AMQP

• Platform, Vendor Neutral Messaging format

• 0.9.1 finalized in November 2008

• JP Morgan Chase, Red Hat, Rabbit Technologies, iMatrix, IONA Technologies, Cisco Systems, Envoy Technologies, Twist Process Innovations

• Members now include Bank of America, Barclays Bank, Microsoft Corporation, Novell, VMware

• 1.0 Final: October 2011 “Changes some stuff”

Page 10: On Rabbits and Elephants

Going to only scratch the surface

RabbitMQ & AMQP Concepts

Page 11: On Rabbits and Elephants

Management Plugin

Page 12: On Rabbits and Elephants

Publishers

Page 13: On Rabbits and Elephants

Message Broker

Page 14: On Rabbits and Elephants

Consumers

Page 15: On Rabbits and Elephants

Message Routing

It’s not just about Queues.

Page 16: On Rabbits and Elephants

Routing Key

Helps Determine Message Destination

(in most cases)

Page 17: On Rabbits and Elephants

Queues

• Durability: Queue survives reboot

• Auto-Delete: Delete the queue when consumer goes away

• Exclusive: Only one consumer may consume

• x-expires: Auto-Delete after given duration

• x-message-ttl: Auto-discard a message after ttl

Page 18: On Rabbits and Elephants

Declaring a Queue
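
A minimal sketch of declaring a queue with the attributes from the Queues slide. It assumes a channel object that behaves like the `BlockingChannel` of the pika Python client; the queue name and both timeout values are made up for illustration and are not taken from the talk.

```python
def declare_demo_queue(channel, name="pgbench_events"):
    """Declare a durable queue with an idle expiry and a message TTL.

    `channel` is assumed to expose pika's queue_declare() signature.
    The queue name and both timeouts are hypothetical values.
    """
    arguments = {
        "x-expires": 30 * 60 * 1000,   # delete the queue after 30 unused minutes (ms)
        "x-message-ttl": 60 * 1000,    # discard messages older than 60 seconds (ms)
    }
    return channel.queue_declare(
        queue=name,
        durable=True,        # queue definition survives a broker restart
        exclusive=False,     # not restricted to the declaring connection
        auto_delete=False,   # keep the queue when the last consumer goes away
        arguments=arguments,
    )


# Usage against a real broker (needs `pip install pika` and a running RabbitMQ):
#   import pika
#   connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   declare_demo_queue(connection.channel())
```

Passing the channel in, rather than opening a connection inside the function, keeps the sketch easy to try against a stub before pointing it at a broker.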

Page 19: On Rabbits and Elephants

Exchanges

(The ones we care about for this talk)

Page 20: On Rabbits and Elephants

Direct exchanges

• 1:1 Message Delivery Pattern

• Routing Keys are direct, no wildcarding

• All queues bound to a routing key get the message

Page 21: On Rabbits and Elephants

Topic exchanges

• Pattern Matching in Routing Keys

• Publish one message type to public.pgbench_accounts, another to public.pgbench_history

• Words are delimited by periods; * matches exactly one word, # matches zero or more words

• Example patterns: *.pgbench_accounts, public.*
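
To make the wildcard rules concrete, here is a small sketch that mimics topic matching in plain Python. It is an illustration only, not RabbitMQ's implementation, and the function name `topic_matches` is invented for this example.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP topic binding pattern matches a routing key."""

    def match(p, k):
        if not p:
            # Pattern exhausted: match only if the key is exhausted too.
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words of the key.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            # '*' stands for exactly one word; literals must be equal.
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


if __name__ == "__main__":
    for pattern in ("*.pgbench_accounts", "public.*", "public.#"):
        for key in ("public.pgbench_accounts", "public.pgbench.history"):
            print(pattern, key, topic_matches(pattern, key))
```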

Page 22: On Rabbits and Elephants

Other Exchange Types

• Built-In: Fanout, Headers

• 3rd Party or Plugin Types:

• Consistent Hash Exchange: Round-robin distribution to queues

• External: RPC Exchange Plugin

• Riak

• Last-Value Cache

• Recent History Exchange: Keeps last 20 messages routed

• Global Fanout: Sends to all queues, regardless of routing key

Page 23: On Rabbits and Elephants

One More thing...

Exchange-to-Exchange Bindings

Page 24: On Rabbits and Elephants

Messages

• Have client specified properties:

• Content type, Encoding, Timestamp, App-Id, User-Id, Headers

• Delivery Mode:

• 1: Non-Persistent

• 2: Persistent
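
As a rough sketch, the properties above could be assembled like this before publishing. The helper name `message_properties`, the header contents, and the JSON content type are assumptions for illustration; the returned keys are chosen to match the keyword arguments of pika's `BasicProperties`.

```python
import time

NON_PERSISTENT = 1  # message may be lost if the broker restarts
PERSISTENT = 2      # message is written to disk when routed to a durable queue


def message_properties(app_id, persistent=True):
    """Build keyword arguments for a properties object such as pika.BasicProperties."""
    return {
        "content_type": "application/json",
        "content_encoding": "utf-8",
        "timestamp": int(time.time()),       # seconds since the epoch
        "app_id": app_id,
        "delivery_mode": PERSISTENT if persistent else NON_PERSISTENT,
        "headers": {"source": "pgbench"},    # hypothetical free-form header
    }


# Usage with pika (needs a running broker; exchange and routing key are examples):
#   import pika
#   channel.basic_publish(
#       exchange="postgres",
#       routing_key="public.pgbench_accounts",
#       body="{}",
#       properties=pika.BasicProperties(**message_properties("demo")),
#   )
```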

Page 25: On Rabbits and Elephants

Step-By-Step Instructions

1. Install your Erlang on the machine you intend to run your RabbitMQ broker on. My preference is for Erlang R14b2, which almost makes me think they name Erlang releases after Star Wars characters, but they don't. Erlang is not as fun as Debian is. It doesn't need to be a beefy machine, but it should have enough RAM that if you don't have an active consumer it won't run out of memory.

2. Install RabbitMQ. This is a multistep process involving black magic. Depending on your distribution you can either download a package, or if you're *nix impaired there is a Windows installer too. Oh, and don't forget to install the management plugin. You can download the management plugin from the RabbitMQ site pretty easily. After you have downloaded it, you put it in the RabbitMQ plugins directory. Kind of makes sense, doesn't it?

3. Go to the store and buy RabbitMQ in Action and RabbitMQ in Detail. What do you mean you can't? No pre-order either? Well there are plenty of good blogs on the intarwebs that have good information such as *cough* mine. Seriously though if you are crazy enough to follow my advice and get this far out of actual interest in the subject, make sure you verse yourself in the operation of and the flexibility that is RabbitMQ.

4. Install one of the following: pl/Python, pl/Perl, pl/Ruby, pl/PHP, pl/PGSQL and pg_amqp, or write your own damn addon in C using librabbitmq and then distribute it using the very cool pgxn service. You only need this on the databases you're going to be writing triggers on. Are you still reading this? Seriously? I would have thought I had lost you by now. Either that or you are skipping ahead because I should have changed the slide by now. It could be that Selena is operating the slides, in which case I have no control over these slides, which is a terrifying thought. You see, there is very little content on the slides, a real lack of substance. If I stay on any one slide for too long then you'll realize I really don't know what I am talking about and then I'd not be asked back to give one of the most important talks of the conference.

5. It's almost time to write your publisher, but don't get too far ahead of yourself. You first need to make sure that you have figured out a routing paradigm to use. Don't you like that word? Paradigm. I hear it in my head as para dig'um. Anyway, back to routing paradigms. Depending on your use case you will have the ability to broadcast your events to a whole slew of databases that care about what you have to send, or just one. You can implement a 1-to-1 type of scenario where a "direct" exchange is set up and your data will be queued in a way that no message is duplicated to your client and there are transactional guarantees in RabbitMQ. You probably don't want that though, as you're a fly-by-the-seat-of-your-pants type of person, aren't you? For that you'll want to turn on noAck mode in your consumers. Rabbit will just fling your messages to your consumers as fast as possible, just like an ape at a zoo and his favorite projectile.

6. This is where things get fun. We are going to write a stored procedure in the language of your choice and have it publish to RabbitMQ using an exchange and a routing key. Now, at this point, if you are still reading this, I have to question your sanity. This was a one-off gag slide to make people at the conference think that I was going to read all of this. Anyway, write your function that publishes and you are almost done. No, seriously. Well, not quite done, because you still need the part that calls this function and then the consumer app that reads the data from RabbitMQ and then does the actions in PostgreSQL that you want it to do.

7. Now, basically you're going to add after insert, update, delete triggers to the tables you care about. I'm not going to go into detail about how to do that since you are at a PostgreSQL conference and should know how to do that already. Instead I am going to assume you know what I am writing about and say that in these triggers you will want to call the function we wrote in step 6.

8. Go to http://github.com/gmr/On-Rabbits-and-Elephants/ and download the sample code and use that instead of all of this nonsense.

Page 26: On Rabbits and Elephants

pg_amqp

• Is a publisher only PostgreSQL extension

• Transactional if you use amqp.publish

• Not if you use amqp.autonomous_publish

• Needs some <3 for more advanced features like delivery mode changing and message properties.

Page 27: On Rabbits and Elephants

Use a Trigger

(benefit from the value of a trigger?)

Page 28: On Rabbits and Elephants

Not that kind of trigger.

Page 29: On Rabbits and Elephants

Maybe this kind.

Page 30: On Rabbits and Elephants

Really this kind.

(pretty boring, I know)

Page 31: On Rabbits and Elephants

Stop. Demo Time.

Page 32: On Rabbits and Elephants

Demo

• Using pgbench

• PostgreSQL 9.1rc1

• pg_amqp and plpythonu

• RabbitMQ 2.6.1

• Python 2.7 (Should work in 2.5 or 2.6 too)

Page 33: On Rabbits and Elephants

Generic Trigger

• Publishes to schema.table_name

• Wraps the entire trigger data set into a JSON payload
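
The two bullets above could look roughly like the following in Python. This is a sketch, not the trigger from the talk's sample repository: the helper name `build_event` and the payload field names are assumptions, while the `TD` keys (`event`, `when`, `table_schema`, `table_name`, `new`, `old`) are the ones PL/Python passes to trigger functions.

```python
import json


def build_event(td):
    """Turn PL/Python trigger data into (routing_key, JSON payload)."""
    routing_key = "%s.%s" % (td["table_schema"], td["table_name"])
    payload = json.dumps(
        {
            "event": td["event"],          # INSERT, UPDATE or DELETE
            "when": td["when"],            # BEFORE or AFTER
            "schema": td["table_schema"],
            "table": td["table_name"],
            "new": td["new"],              # None for DELETE
            "old": td["old"],              # None for INSERT
        },
        sort_keys=True,
        default=str,                       # stringify values json cannot encode
    )
    return routing_key, payload


# Inside a PL/Python trigger function this might be used as follows.
# Broker id 1 and the exchange name "postgres" are hypothetical.
#   routing_key, payload = build_event(TD)
#   plan = plpy.prepare("SELECT amqp.publish($1, $2, $3, $4)",
#                       ["integer", "varchar", "varchar", "varchar"])
#   plpy.execute(plan, [1, "postgres", routing_key, payload])
```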

Page 34: On Rabbits and Elephants

consumer.py

Dynamically builds SQL based upon event type and executes on a remote PostgreSQL database

./consumer.py 192.168.103.135
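
One way the "builds SQL based upon event type" step might work is sketched below. It is not the actual consumer.py: it assumes a JSON message with `event`, `schema`, `table`, `new` and `old` fields, matches rows on every old column value because no primary key information is available, and emits `%s` placeholders in the style psycopg2 expects.

```python
import json


def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def where_clause(row):
    """Match a row on all of its column values, treating NULLs as equal."""
    cols = sorted(row)
    sql = " AND ".join(quote_ident(c) + " IS NOT DISTINCT FROM %s" for c in cols)
    return sql, [row[c] for c in cols]


def build_statement(message_body):
    """Return (sql, params) for one INSERT, UPDATE or DELETE event."""
    event = json.loads(message_body)
    table = quote_ident(event["schema"]) + "." + quote_ident(event["table"])
    kind = event["event"]
    if kind == "INSERT":
        cols = sorted(event["new"])
        names = ", ".join(quote_ident(c) for c in cols)
        marks = ", ".join(["%s"] * len(cols))
        sql = "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")"
        return sql, [event["new"][c] for c in cols]
    if kind == "UPDATE":
        cols = sorted(event["new"])
        assignments = ", ".join(quote_ident(c) + " = %s" for c in cols)
        where, where_params = where_clause(event["old"])
        sql = "UPDATE " + table + " SET " + assignments + " WHERE " + where
        return sql, [event["new"][c] for c in cols] + where_params
    if kind == "DELETE":
        where, where_params = where_clause(event["old"])
        return "DELETE FROM " + table + " WHERE " + where, where_params
    raise ValueError("unsupported event type: %r" % kind)


# With psycopg2 the result would be executed as: cursor.execute(sql, params)
```

Values travel as parameters rather than being pasted into the SQL text, so only the identifiers need quoting.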

Page 35: On Rabbits and Elephants

Proof!

./watch-messages.py 192.168.103.135

./watch-db-client.py

Page 36: On Rabbits and Elephants

Interactive Demo

Page 37: On Rabbits and Elephants

Resources

• rabbitmq: http://www.rabbitmq.com

• pg_amqp: http://pgxn.org/dist/pg_amqp/

  easy_install pgxnclient

  USE_PGXS=1 pgxn install pg_amqp

• example code: https://github.com/gmr/On-Rabbits-and-Elephants (pgConf.eu Branch)

• follow me on twitter: @crad

Please give feedback:

http://2011.pgconf.eu/feedback