Polyglot parallelism

Post on 29-Jan-2018


Polyglot Parallelism: A Case Study in Using Erlang and Ruby at Rackspace

The Problem, Part I

20,000 network devices

9 Datacenters, 3 Continents

devices not designed for high-throughput management

we need a high throughput solution

the time spent in I/O is the primary bottleneck

if you want to speed things up, you have to talk to more devices in parallel
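
When I/O wait dominates, talking to many devices at once is cheap in Erlang because processes are lightweight. The following is a minimal sketch of that idea, not the code from the talk: `pmap_sketch`, `pmap/2`, and `fake_device_call/1` are hypothetical names, and the sleep stands in for a slow device.

```erlang
-module(pmap_sketch).
-export([pmap/2, demo/0]).

%% Apply F to every element of Items, one process per item,
%% and return the results in the original order.
pmap(F, Items) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(Item)} end),
                Ref
            end || Item <- Items],
    %% Ref is already bound here, so each receive waits for its own reply.
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% Hypothetical stand-in for a slow device call: sleep, then answer.
fake_device_call(Id) ->
    timer:sleep(100),
    {Id, ok}.

%% Fifty simulated devices are contacted concurrently, so the total
%% wall-clock time stays close to one call rather than fifty.
demo() ->
    pmap(fun fake_device_call/1, lists:seq(1, 50)).
```

Spawning one process per item is the simplest form. The jobs framework shown later in the talk uses a pool of workers fed from a queue instead, which limits how many devices are contacted at once.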

The Problem, Part II

huge blobs of data

lots of backups equals big database

ad-hoc searching is difficult but important

customer SLAs mean we need to restore from backup quickly

an event must be generated for each device interaction

migrations are problematic with that much data

rigid schema made adapting to new devices difficult

each device type has different properties

“backup” means different things for each device type

need to grow with the business

Previous Solution

multiple Ruby apps

difficult to scale

vendor device managers

New Solution

the simplest thing that could possibly work

most db writes come from scheduled jobs

Rails

MongoDB

Erlang

ReST API

Other Clients

Network Devices

Joe Armstrong

AXD301 ATM Switch

99.9999999%

Functional

Dynamically Typed

Single Assignment

A = 1. %=> 1
A = 2. %=> badmatch

[B, 2, C] = [1, 2, 3].
B = 1. %=> 1
C = 3. %=> 3

Immutable Data Structures

D = dict:new().
D1 = dict:store(foo, 1, D).
D2 = dict:store(bar, 2, D1).

Concurrency Oriented

-module(fact).
-export([fac/1]).

fac(0) -> 1;
fac(N) -> N * fac(N-1).

-module(quicksort).
-export([quicksort/1]).

quicksort([]) -> [];
quicksort([Pivot|Rest]) ->
    quicksort([Front || Front <- Rest, Front < Pivot])
    ++ [Pivot] ++
    quicksort([Back || Back <- Rest, Back >= Pivot]).

Details

jobs framework

Components: Runner, Workers, Callback Module

[Sequence diagram between Runner, Worker, and Callback Module. Message labels: start, ready, process item, process, ready, ..., stop]

“behaviour” is interface

behaviour_info(callbacks) ->
    [{init, 1},
     {process_item, 3},
     {worker_died, 5},
     {job_stopping, 1},
     {job_complete, 2}].

running({worker_ready, WorkerPid, ok}, S) ->
    case queue:out(S#state.items) of
        {empty, I2} ->
            stop_worker(WorkerPid, S),
            {next_state, complete, S#state{items = I2}};
        {{value, Item}, I2} ->
            job_worker:process(WorkerPid, Item, now(), S#state.job_state),
            {next_state, running, S#state{items = I2}}
    end;

handle_info({'DOWN', _, process, WorkerPid, Info}, StateName, S) ->
    {Item, StartTime} = clear_worker(WorkerPid, S),
    Callback = S#state.callback,
    spawn(Callback, worker_died,
          [Item, WorkerPid, StartTime, Info, S#state.job_state]),
    %% Start a replacement worker
    start_workers(1, Callback),
    {next_state, StateName, S};

handle_cast({process, Item, StartTime, JS}, S) ->
    Callback = S#state.callback,
    Continue = try
                   Callback:process_item(Item, StartTime, JS)
               catch
                   throw:Error ->
                       error_logger:error_report(Error),
                       ok
               end,
    job_runner:worker_ready(S#state.runner, self(), Continue),
    {noreply, S}.
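
To make the callback list concrete, here is a hypothetical callback module with the five functions and arities named in `behaviour_info/1`. Only the `ok` return from `process_item/3` is grounded in the runner code above, where `worker_ready` matches on `ok`. The module name `echo_job`, the argument meanings for `init/1` and `job_complete/2`, and the other return values are assumptions made for illustration.

```erlang
-module(echo_job).
%% Callback names and arities follow the behaviour_info/1 list on the slide.
-export([init/1, process_item/3, worker_died/5,
         job_stopping/1, job_complete/2]).

%% Assumption: init/1 receives job arguments and returns the job state.
init(Args) ->
    {ok, Args}.

%% Returning ok lets the runner hand this worker the next queued item.
process_item(Item, _StartTime, _JobState) ->
    io:format("processing ~p~n", [Item]),
    ok.

%% Argument order matches the spawn/3 call in the runner's handle_info.
worker_died(Item, WorkerPid, _StartTime, Info, _JobState) ->
    error_logger:error_report([{item, Item},
                               {worker, WorkerPid},
                               {reason, Info}]),
    ok.

%% Assumption: both of these are notifications whose result is ignored.
job_stopping(_JobState) ->
    ok.

job_complete(_Status, _JobState) ->
    ok.
```

Because the runner only ever calls `Callback:Function(...)`, any module exporting these functions can be plugged in without changing the framework.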

story time

ReSTful API with Webmachine

The Convention Over Configuration Webserver

HTTP Request Lifecycle Diagram

http://webmachine.basho.com

Webmachine Is Simple As Proven by the “Number of Types of Things” Measurement of Complexity

If you know HTTP

The 3 Most Important Types of Things In Webmachine

1. Dispatch Rules (pure data--barely a thing!)
2. Resources (composed of simple functions!)
3. Requests (simple get/set interface!)

Dispatch Rules

GET /devices/12345

Webmachine inspects the device_resource module for defined callbacks, and sets the Request record’s “server” value to 12345.

{ ["devices", server], device_resource, [] }

Resources

• POEM (Plain Old Erlang Module)
• Composed of referentially transparent functions*
• Functions are callbacks into the request lifecycle
• Approximately 30 possible callback functions, e.g.:
  • resource_exists → 404 Not Found
  • is_authorized → 401 Unauthorized

* mostly

Perma-404

resource_exists(Request, Context) ->
    {false, Request, Context}.

Lucky Auth

is_authorized(Request, Context) ->
    %% time() returns {Hour, Minute, Second}, the tuple that
    %% calendar:time_to_seconds/1 expects.
    S = calendar:time_to_seconds(time()),
    case S rem 2 of
        0 -> {true, Request, Context};
        1 -> {"Basic realm=lucky", Request, Context}
    end.

Resource Functions

Requests

• The first argument to each resource function
• Set and read request & response data

wrq:set_resp_header("X-Answer", "42", Request).

RemoteIP = wrq:peer(Request).

content_types_provided(Request, Context) ->
    Types = [{"application/json", to_json}],
    {Types, Request, Context}.

to_json(Request, Context) ->
    Device = proplists:get_value(device, Context),
    UserId = get_user_id(Request),
    case fe_api_firewall:get_config(Device, UserId) of
        {ok, Config} ->
            success_response(Config, Request, Context);
        {error, Reason} ->
            error_response(502, Reason, Request, Context)
    end.

Retrieving a JSON Firewall Representation

Gotchas

primitive obsession

string-ish

[<<"easy as ">>, [$a, $b, $c], " ☺\n"].

"hi how are you"

<<"hello there">>
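
The three string-ish forms above are different types that happen to print alike. The sketch below is an illustration rather than code from the talk, and the module name `stringish` is hypothetical. It checks how they relate. One assumption worth flagging: `iolist_to_binary/1` only accepts bytes (0..255), so text containing a character such as ☺ is converted with `unicode:characters_to_binary/1` here instead.

```erlang
-module(stringish).
-export([demo/0]).

demo() ->
    %% A double-quoted string is a list of integer code points.
    true = is_list("abc"),
    true = ("abc" =:= [$a, $b, $c]),
    %% A binary is a different type, even when it holds the same text.
    true = is_binary(<<"abc">>),
    false = ("abc" =:= <<"abc">>),
    %% An iolist may nest binaries, strings, and characters freely.
    IoList = [<<"easy as ">>, [$a, $b, $c], "\n"],
    <<"easy as abc\n">> = iolist_to_binary(IoList),
    %% Code points above 255 need the unicode module; 16#263A is the
    %% smiley, encoded in UTF-8 as the bytes E2 98 BA.
    <<"hi ", 16#E2, 16#98, 16#BA>> =
        unicode:characters_to_binary(["hi ", 16#263A]),
    ok.
```

Each match in `demo/0` crashes with `badmatch` if the claim in its comment were false, so a return value of `ok` confirms all of them.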

hashes vs records

to loop is human, to recur divine
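
Erlang has no `for` or `while`, so iteration is written as recursion. A minimal sketch, using the hypothetical module `recur`, of what a Ruby `each` loop with a running total might look like as a tail-recursive function with an accumulator:

```erlang
-module(recur).
-export([sum/1]).

%% Public entry point: start the accumulator at zero.
sum(List) ->
    sum(List, 0).

%% The recursive call is the last thing each clause does, so the
%% function runs in constant stack space however long the list is.
sum([], Acc) ->
    Acc;
sum([Head | Tail], Acc) ->
    sum(Tail, Acc + Head).
```

In everyday code the same result usually comes from `lists:sum/1` or `lists:foldl/3`; the explicit form is shown only to make the pattern visible.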

Erlang conditionals always return a value
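
Because `case` and `if` are expressions, their result can be bound directly instead of assigning inside each branch. A small sketch, with the hypothetical module `cond_value` and made-up status values:

```erlang
-module(cond_value).
-export([describe/1]).

describe(Status) ->
    %% The whole case expression evaluates to the chosen branch's value.
    Label = case Status of
                ok -> "reachable";
                {error, timeout} -> "timed out";
                {error, _Other} -> "failed"
            end,
    %% if works the same way; the final true clause acts as the else.
    Size = if
               length(Label) > 8 -> long;
               true -> short
           end,
    {Label, Size}.
```

A value that matches no clause raises a `case_clause` error rather than silently yielding nothing, which is part of why every branch needs to produce a result.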

design for testability

don’t spawn, use OTP
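
As an illustration of what using OTP instead of a bare `spawn` can look like, here is a minimal `gen_server` sketch. The module `counter_server` and its API are hypothetical and unrelated to the talk's code. The behaviour supplies the receive loop, the call and cast protocols, and the hooks a supervisor needs.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API
-export([start_link/0, increment/1, value/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, 0, []).

%% Asynchronous: returns ok immediately.
increment(Pid) ->
    gen_server:cast(Pid, increment).

%% Synchronous: waits for the server's reply.
value(Pid) ->
    gen_server:call(Pid, value).

%% The server state is just the current count.
init(Count) ->
    {ok, Count}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1}.

%% Ignore any stray messages instead of crashing.
handle_info(_Msg, Count) ->
    {noreply, Count}.
```

A usage example from the shell: `{ok, Pid} = counter_server:start_link()`, then `counter_server:increment(Pid)` twice, after which `counter_server:value(Pid)` returns 2.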

Downsides

Erlang changes very slowly

3rd party libraries

standard library can be inconsistent

package management

Questions

Phil: @philtoland
http://github.com/toland
http://philtoland.com

Mike: @lifeinzembla
http://github.com/msassak

http://spkr8.com/t/7806