Concurrent Programming with Ruby and Tuple Spaces


Luc Castera, Founder / messagepub.com

The Free Lunch is Over:
A Fundamental Turn Toward Concurrency in Software

Source: http://www.gotw.ca/publications/concurrency-ddj.htm

"The major processor manufacturers and architectures have run out of room with most of their traditional approaches to boosting CPU performance. Instead of driving clock speeds and straight-line instruction throughput ever higher, they are instead turning en masse to hyperthreading and multicore architectures. [...] And that puts us at a fundamental turning point in software development, at least for the next few years..."

Herb Sutter, March 2005

Outline

1. The problem with Ruby Threads

2. Multiple Ruby Processes

3. Inter-process Communication with TupleSpaces

PART 1



The Problem With Threads

A closer look at the Ruby threading model

3 Types of Threading Models (kernel threads : user-space threads):

1 : N
1 : 1
M : N

1 : N Green Threads

One kernel thread for N user threads

aka lightweight threads

(Diagram: green threads taking turns in 10 ms time slices)

RUBY 1.8

Pros and Cons

Pros:

Thread creation, execution, and cleanup are cheap

Lots of threads can be created

Cons:

Not really parallel: the kernel scheduler doesn't know about green threads, so it can't schedule them across CPUs or take advantage of SMP

A blocking I/O operation can block all green threads

Example: C extensions

Example: mysql gem (solution: NeverBlock mysqlplus)

1 : 1 Native Threads

1 kernel thread for each user thread

Pros and Cons

Pros:

Threads can execute on different CPUs (truly parallel)

Threads do not block each other

Cons:

Setup overhead

Low limit on number of threads

Linux kernel bug with lots of threads

RUBY 1.9

I lied.

Global Interpreter Lock

A Global Interpreter Lock (GIL) is a mutual exclusion lock held by a programming language interpreter thread to avoid sharing code that is not thread-safe with other threads. There is always one GIL for one interpreter process.

Usage of a Global Interpreter Lock in a language effectively limits concurrency of a single interpreter process with multiple threads: there is no or very little increase in speed when running the process on a multiprocessor machine.

Source: Wikipedia

A person (male or female) who intentionally or unintentionally stops the progress of two others getting their game on.

Concurrency is a myth in Ruby

Ilya Grigorik:

The implications of the GIL are surprising at first, but it turns out the solution to this problem is not all that complex: instead of thinking in threads, think how you could split the workload between different processes. Not only will you bypass an entire class of problems associated with concurrent programming (it's hard!), but you are also much more likely to end up with a horizontally scalable architecture for your application. Here are the steps:

1. Partition the work, or decompose your application
2. Add a communications / work queue (Starling, Beanstalkd, RabbitMQ)
3. Fork, or run multiple instances of your application

Not surprisingly, many Ruby applications have already adopted this strategy: a typical Rails deployment is powered by a cluster of app servers (Mongrel, Ebb, Thin), and alternative strategies like EventMachine and Revactor (equivalents of Twisted in Python) are gaining ground as a simple way to defer and parallelize your network IO without introducing threads into your application.

Unless you are using JRuby.

A note on Fibers

Ruby 1.9 introduces fibers.

Fibers are green threads, but scheduling must be done by the programmer and not the VM.

Faster and cheaper than native threads.

Implemented for Ruby 1.8 by Aman Gupta.

Learn More:

http://tinyurl.com/rubyfibers

http://all-thing.net/fibers

http://all-thing.net/fibers-via-continuations
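To make "scheduling must be done by the programmer" concrete, here is a minimal sketch of a fiber used as a generator. The Fibonacci sequence and the variable names are only illustrative choices; the point is that the fiber runs only when the caller resumes it and pauses itself with Fiber.yield.

```ruby
# A fiber runs only when resumed and pauses itself with Fiber.yield.
# Requires Ruby 1.9 or later.
fib = Fiber.new do
  a, b = 0, 1
  loop do
    Fiber.yield a      # hand control (and a value) back to the caller
    a, b = b, a + b
  end
end

# The caller decides when the fiber runs: six resumes, six values.
first_six = Array.new(6) { fib.resume }
puts first_six.inspect
```

Nothing preempts the fiber: if the caller never calls resume again, the loop simply stays paused.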

M : N Hybrid Model

M kernel threads for N user threads

best of both worlds

Pros and Cons

Pros:

Take advantage of multiple CPUs

Not all threads are blocked by blocking system calls

Cheap creation, execution, and cleanup

Cons:

Need schedulers in userland and in the kernel to work with each other

Green threads doing blocking I/O operations will block all other green threads sharing the same kernel thread

Difficult to write, maintain, and debug code

"Writing multi-threaded code is really, really hard. And it is hard because of Shared Memory."

Jim Weirich

The Other Problem with Threads

http://rubyconf2008.confreaks.com/what-all-rubyist-should-know-about-threads.html
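A small sketch of why shared memory is the hard part: several threads update one counter, and a Mutex is needed to keep the update safe. The method name count_with_threads is made up for this illustration. Even with a GIL, "counter += 1" is a read followed by a write and is not guaranteed to be atomic.

```ruby
require 'thread'  # harmless on modern Ruby; needed for Mutex on Ruby 1.8

# Hypothetical helper: N threads each increment a shared counter.
def count_with_threads(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # Without the lock, two threads may read the same old value
        # and one of the updates would be lost.
        lock.synchronize { counter += 1 }
      end
    end
  end
  threads.each(&:join)
  counter
end

puts count_with_threads(4, 10_000)
```

Every piece of shared state needs this kind of care, which is the argument for not sharing memory at all.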

Multi-Threaded Code is Hard + Concurrency is a Myth = FAIL!

Stop thinking in threads

Design your application to use multiple processes
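A minimal sketch of the multi-process approach, assuming a Unix-like system (fork is not available on Windows or JRuby). The method name parallel_sums is hypothetical, and a plain pipe stands in for a real work queue: the work is partitioned, each partition runs in its own process, and results come back to the parent as marshalled data.

```ruby
# Each child computes one partition and sends the result back over a pipe.
def parallel_sums(partitions)
  workers = partitions.map do |numbers|
    reader, writer = IO.pipe
    pid = fork do
      reader.close                      # child only writes
      sum = numbers.inject(0) { |acc, n| acc + n }
      writer.write(Marshal.dump(sum))
      writer.close
    end
    writer.close                        # parent keeps only the read end
    [pid, reader]
  end

  # Collect results in the same order as the partitions.
  workers.map do |pid, reader|
    result = Marshal.load(reader.read)  # read until the child closes its end
    reader.close
    Process.wait(pid)
    result
  end
end

puts parallel_sums([[1, 2, 3], [10, 20], [100]]).inspect
```

No memory is shared here: the only thing crossing the process boundary is the serialized result.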

PART 2



Multiple Ruby Processes

Pros and Cons

Pros:

No longer sharing memory

Take advantage of multiple CPUs (Performance)

Not all threads are blocked by blocking system calls.

Scalability

Fault-Tolerance

Cons:

Process creation, execution, and cleanup are expensive

Uses a lot of memory (loading Ruby VM for every process)

Need a way for processes to communicate!

Latency

Starting/Stopping

Fault-Tolerance

Monitoring

but we will focus on...

How do the processes communicate?

Options

DRb

Sockets

Queues: RabbitMQ, ActiveMQ

Key-Value Databases: Redis, Tokyo Cabinet, Memcached

Relational Databases

XMPP

TupleSpaces

Examples

Rails + Mongrel/Thin

Cluster of application servers (Mongrel, Thin...)

Communication between processes is done via the database.

Nanite

A self-assembling fabric of Ruby daemons

http://github.com/ezmobius/nanite

Uses RabbitMQ/AMQP for IPC

Revactor

Uses the actor model

Actors are kinda like threads, with messaging baked-in.

Each Actor has a mailbox.

It's like coding Erlang in Ruby.

Messages are passed between actors using TCP sockets.

Good Documentation

http://revactor.org/

"Erlang provides a sledgehammer for the problems of concurrent programming. But, sometimes you don't need a sledgehammer... just a flyswatter will do."

Tony Arcieri

Discontinued in favor of Reia
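The mailbox idea can be sketched in plain Ruby with a Thread and a Queue. This is not Revactor's API: the TinyActor class, its method names, and the :stop sentinel are invented for this illustration, and messages stay inside one process instead of travelling over TCP sockets.

```ruby
require 'thread'

# Hypothetical minimal actor: one thread draining one mailbox (a Queue).
class TinyActor
  def initialize(&handler)
    @mailbox = Queue.new
    @thread = Thread.new do
      # :stop is a sentinel chosen for this sketch
      while (message = @mailbox.pop) != :stop
        handler.call(message)
      end
    end
  end

  # Senders never touch the actor's state; they only enqueue messages.
  def send_message(message)
    @mailbox << message
    self
  end

  # Ask the actor to finish and wait until its mailbox is drained.
  def stop
    @mailbox << :stop
    @thread.join
  end
end

received = []
collector = TinyActor.new { |msg| received << msg }
collector.send_message([:add, 1, 2]).send_message([:add, 60, 5])
collector.stop
puts received.inspect
```

Because only the actor's own thread handles messages, the handler needs no locks.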

Journeta

Journeta is a dirt simple library for peer discovery and message passing between Ruby applications on a LAN.

Uses UDP sockets for IPC

Uses the "fucked up Ruby socket API" (quoted from their RDoc)

Demo(?)

If time permits, show demo.

PART 3


TupleSpaces

Interprocess Communication with TupleSpaces

A tuple space provides a repository of tuples that can be accessed concurrently.

The Blackboard Metaphor

Tuples in the space:

[:add, 1, 2]
[:result, 79]
[:add, 60, 5]
[:token]
[:search, linda]
[:where_is, :waldo]
[:subtract, 10, 2]
[:save, 7864]

Example templates (nil matches any value):

[:add, nil, nil]
[nil]
[:where_is, :waldo]

About Tuple Spaces

First implementation was Linda.

Linda was developed by David Gelernter and Nicholas Carriero at Yale University.

Implementations exist for most languages.

The Ruby implementation is Rinda.

Rinda is a built-in library, so no need to install.
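A minimal sketch of template matching with Rinda, using an in-process TupleSpace so that no server is needed. The tuples are taken from the blackboard slides; the variable names are illustrative. On recent Ruby releases rinda and drb may ship as bundled gems rather than as part of the core standard library.

```ruby
require 'rinda/tuplespace'

space = Rinda::TupleSpace.new   # in-process space, no server needed

[[:add, 1, 2], [:result, 79], [:add, 60, 5], [:token]].each do |tuple|
  space.write(tuple)
end

adds   = space.read_all([:add, nil, nil])    # nil matches any value
single = space.read_all([nil])               # only one-element tuples match
typed  = space.read_all([:result, Integer])  # classes match via ===

puts adds.inspect
puts single.inspect
puts typed.inspect
```

A template only matches tuples of the same length, which is why [nil] finds [:token] and nothing else.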

5 Basic Operations

read: Reads a tuple, but does not remove it. Blocking by default, but takes an optional timeout argument.

read_all: Returns all tuples matching the given template. Does not remove the found tuples.

write: Adds a tuple. Takes an optional timeout parameter (after which the tuple expires).

take: Atomic read + delete. Blocking by default, but takes an optional timeout argument.

notify: Registers for notifications of events: write, take, delete.
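The five operations can be sketched against an in-process Rinda::TupleSpace. The variable names are illustrative. A timeout of 0 is used to show the non-blocking form, which raises Rinda::RequestExpiredError when nothing matches.

```ruby
require 'rinda/tuplespace'

space = Rinda::TupleSpace.new

# notify: subscribe before writing so the event is captured
observer = space.notify('write', [:add, nil, nil])

space.write([:add, 1, 2])                  # write: add a tuple
event = observer.pop                       # an [event_name, tuple] pair

peeked = space.read([:add, nil, nil])      # read: the tuple stays in the space
all    = space.read_all([:add, nil, nil])  # read_all: every match, none removed
taken  = space.take([:add, nil, nil])      # take: read and remove atomically

# With a timeout of 0, take fails immediately instead of blocking.
begin
  space.take([:add, nil, nil], 0)
  empty = false
rescue Rinda::RequestExpiredError
  empty = true
end

puts event.inspect
puts [peeked, all, taken, empty].inspect
```

Without a timeout, read and take would block until another agent writes a matching tuple, which is how agents wait for work.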

Key Features

Spaces are shared: the space handles the details of concurrent access

Spaces are persistent: if an agent process dies, the data is still in the space

However, if the space process dies, the data is lost (?)

Spaces are associative: lookups are by content rather than by memory location or identifier

Spaces are transactionally secure: operations are atomic

Spaces allow us to exchange executable content

A Rinda tuple can be an array or a hash

( But let's stick with the array, I like that better! )
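For completeness, a short sketch of the hash form. The keys 'op' and 'args' are made up for this example. Rinda requires hash tuples to use String keys, and a template matches only a tuple with the same set of keys.

```ruby
require 'rinda/tuplespace'

space = Rinda::TupleSpace.new

# Hash tuples must use String keys; Rinda rejects other key types.
space.write({ 'op' => 'add', 'args' => [1, 2] })

# The template names the same keys; nil is a wildcard for the value.
job = space.take({ 'op' => 'add', 'args' => nil }, 0)
puts job.inspect
```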

Start a Tuple Space on port 1234

Clients/Agents
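The code for these two slides did not survive in the transcript. The following is a sketch of how a space is commonly exposed over DRb and reached from an agent, not necessarily what the slides showed. The helper names start_space and connect and the file name space.rb are assumptions; port 1234 comes from the slide.

```ruby
require 'rinda/tuplespace'

SPACE_URI = 'druby://localhost:1234'

# Server side: expose a TupleSpace over DRb; returns the URI actually bound.
def start_space(uri = SPACE_URI)
  DRb.start_service(uri, Rinda::TupleSpace.new)
  DRb.uri
end

# Client/agent side: wrap the remote space in a proxy.
def connect(uri = SPACE_URI)
  # Agents need a local DRb service too, so the space can call back to them.
  DRb.start_service unless DRb.primary_server
  Rinda::TupleSpaceProxy.new(DRbObject.new_with_uri(uri))
end

# Run the server with: ruby space.rb server   (space.rb is a hypothetical name)
if ARGV.first == 'server'
  start_space
  DRb.thread.join   # keep serving until interrupted
end
```

An agent in another process would then call connect and use write, read, and take on the returned proxy exactly as it would on a local space.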

DEMO
Rinda

RingServer

This is also a TupleSpace

SPOF (single point of failure)

Rinda is not persistent...

If it crashes while you have tuples in the space, you lose them all.

Only Ruby

Introducing Blackboard

TupleSpace implementation on top of Redis

Persistent

Redis is a really fast key-value database. Like memcached, but data is not volatile.

Same API: Plug & Play

For now, only supports: take, read, and write

http://github.com/dambalah/blackboard

Server

Just start the redis-server:

$ redis-server

Client/Agents

DEMO
Blackboard

Blackboard Benchmarks

Blackboard: Future

Move from Redis to a custom Erlang-based blackboard implementation.

I would like that Erlang implementation to be easy to use from other programming languages as well.

So it's really two projects:

Blackboard in Erlang

Ruby library to talk to Blackboard in Erlang

Thank you!

Luc Castera, Founder / messagepub.com

Questions? Feedback?

[email protected]

www.speakerrate.com

Resources / References

Part 1: Threading Models

http://timetobleed.com/threading-models-so-many-different-ways-to-get-stuff-done/

http://envycasts.com/products/scaling-ruby

http://www.infoq.com/news/2007/05/ruby-threading-futures

http://thebogles.com/blog/2006/11/ruby-threading/

http://spec.ruby-doc.org/wiki/Ruby_Threading

http://www.bitwiese.de/2007/09/on-processes-and-threads.html

http://www.igvita.com/2008/11/13/concurrency-is-a-myth-in-ruby/

http://bartoszmilewski.wordpress.com/2008/08/24/threads-dont-scale-processes-do/

http://en.wikipedia.org/wiki/Global_Interpreter_Lock

http://www.gotw.ca/publications/concurrency-ddj.htm

http://tinyurl.com/rubyfibers

Resources / References

Part 2: Multiple Processes

http://github.com/ezmobius/nanite

http://erlang.org/

http://www.rabbitmq.com/

http://code.google.com/p/redis/

http://revactor.org/

http://journeta.rubyforge.org/

http://home.mindspring.com/~eric_rollins/ParallelRuby.html

Resources / References

Part 3: TupleSpaces

http://c2.com/cgi/wiki?TupleSpace

http://en.wikipedia.org/wiki/Tuplespace

http://www.julianbrowne.com/article/viewer/space-based-architecture-example

http://www.rubyagent.com/

http://segment7.net/projects/ruby/drb/

http://segment7.net/projects/ruby/drb/rinda/ringserver.html

JavaSpaces Principles, Patterns, and Practice. Freeman, Hupfer, et al.

http://www.ruby-doc.org/stdlib/libdoc/rinda/rdoc/index.html

Things I wish I had time to spend on

MPI and Ruby-MPI: http://github.com/abedra/mpi-ruby/tree/master

Ruby forkoff: http://tinyurl.com/forkoff

Operation (count)    Rinda       Blackboard
Write (1000)         0.042749    0.253068
Take (500)           0.082744    15.844250
Read (500)           0.020098    20.098478

06/10/2009, 20:34:24