Magic of Ruby


A little story to explain why I think ruby is pure magic

Transcript of Magic of Ruby

Page 1: Magic of Ruby

the magic of ruby

Page 3: Magic of Ruby

once upon a time there was a developer working for a big company

Page 4: Magic of Ruby

it’s very easy, we

need something quick and

dirty...

in the beginning...

Page 5: Magic of Ruby

1. login
2. go to a page
3. scrape a number
4. that’s it!!!

in the beginning...

Page 6: Magic of Ruby

ok!

Page 7: Magic of Ruby

require "mechanize"

agent = Mechanize.new do | agent |
  agent.user_agent_alias = "Linux Mozilla"
end

# username and password are assumed to be defined elsewhere
agent.get("http://example.com/login") do | login_page |

  # fill in the login form and submit it
  result_page = login_page.form_with(:name => "login") do | login_form |
    login_form["username"] = username
    login_form["password"] = password
  end.submit

  # one hash per matching table: caption text plus the text of the third cell
  result_page.search("//table[starts-with(@class,'boundaries')]").map do | option_table |
    {
      "name" => option_table.search("./caption/child::text()"),
      "credits" => option_table.search("./descendant::td[position()=3]/child::text()")
    }
  end

end

enter mechanize + nokogiri

Page 8: Magic of Ruby

and then...

good, but we’d like to extract more information from a few different pages

Page 9: Magic of Ruby

enter commander

command :is_registered do | command |
  command.syntax = "is_registered --username TELEPHONE_NUMBER [ --without-cache ]"
  command.description = "Check if user is registered"
  command.option "-u", "--username TELEPHONE_NUMBER", String, "user's telephone number"
  command.option "-n", "--without-cache", "bypass user's profile informations cache"

  command.when_called do | arguments, options |
    options.default :username => "", :without_cache => false
    ok(is_registered(options.username, options.without_cache))
  end
end

extract code into functions

describe arguments
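
The slide describes its arguments with the commander gem. As a rough sketch of the same idea using only the standard library (this is OptionParser, not commander's API, and `parse_is_registered` is a hypothetical helper invented here), the two options and their defaults could be declared like this:

```ruby
require "optparse"

# Hypothetical helper: parses the arguments of an is_registered command.
# Built on OptionParser from the standard library, not on the commander gem.
def parse_is_registered(argv)
  # same defaults as on the slide
  options = { :username => "", :without_cache => false }

  parser = OptionParser.new do | opts |
    opts.banner = "is_registered --username TELEPHONE_NUMBER [ --without-cache ]"
    opts.on("-u", "--username TELEPHONE_NUMBER", String, "user's telephone number") do | value |
      options[:username] = value
    end
    opts.on("-n", "--without-cache", "bypass the profile information cache") do
      options[:without_cache] = true
    end
  end

  parser.parse(argv)  # non-destructive: argv itself is left untouched
  options
end

options = parse_is_registered(["--username", "XXX3760593", "--without-cache"])
```

Keeping the parsing in one function leaves the command body free to call `is_registered` with plain values, which is the separation the slide is after.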

Page 10: Magic of Ruby

use page object pattern

def is_registered(username)
  browse do | agent, configuration |
    LoginPage.new(
      agent.get(configuration["login_page_url"])
    ).is_registered?(username)
  end
end

Page 11: Magic of Ruby

use page object pattern

class LoginPage < PageToScrub

  def is_registered?(username)
    begin
      login(username, "fake password")
    rescue WrongPassword
      true
    rescue NotRegistered, WrongUsername, WrongArea
      false
    end
  end

  def login(username, password)
    check_page(
      use_element(:login_form) do | login |
        login["username"] = username
        login["password"] = password
      end.submit
    )
  end

  def login_form
    @page.form_with(:name => "login")
  end

end

useful abstractions
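
The trick in `is_registered?` is to log in with a password that is certainly wrong and to read the answer from the kind of failure that comes back. A minimal sketch of that idea, assuming a stubbed `login` in place of the real form submission (`FakeLoginPage`, its list of known users and the error hierarchy are all invented for illustration):

```ruby
# Error classes named as on the slides; the hierarchy is an assumption.
class ScrubError < StandardError; end
class WrongPassword < ScrubError; end
class NotRegistered < ScrubError; end

# Hypothetical stand-in for LoginPage: no HTTP, just a list of known users.
class FakeLoginPage
  def initialize(known_users)
    @known_users = known_users
  end

  def is_registered?(username)
    login(username, "fake password")
  rescue WrongPassword
    true    # the site knows the user and only rejects the password
  rescue NotRegistered
    false   # the site does not know the user at all
  end

  private

  # Stub: the real method submits the login form and checks the result page.
  def login(username, password)
    raise NotRegistered unless @known_users.include?(username)
    raise WrongPassword
  end
end

page = FakeLoginPage.new(["XXX3760593"])
```

With this stub, `page.is_registered?("XXX3760593")` is answered by the `WrongPassword` branch and any other number by the `NotRegistered` branch.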

Page 13: Magic of Ruby

use page object pattern

class PageToScrub

  ...

  def use_element(element_name)
    element = self.send(element_name)
    raise MalformedPage.new(@page, "unable to locate #{element_name}") if (
      element.nil? || (element.empty? rescue true)
    )
    return yield(element) if block_given?
    element
  end

  ...

end
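
To see `use_element` at work without a real page, here is a small self-contained sketch. `PageSketch` and the Hash standing in for a Mechanize page are hypothetical, and the emptiness check is deliberately simplified: unlike the slide's `rescue true`, it treats objects that do not respond to `empty?` as present.

```ruby
# The slides raise MalformedPage with the page and a message; this definition is assumed.
class MalformedPage < StandardError
  def initialize(page, message)
    super(message)
    @page = page
  end
end

class PageSketch
  def initialize(page)
    @page = page   # a plain Hash stands in for a Mechanize page
  end

  # Looks up an element by calling the method of the same name,
  # fails loudly when it is missing or empty, otherwise yields or returns it.
  def use_element(element_name)
    element = send(element_name)
    if element.nil? || (element.respond_to?(:empty?) && element.empty?)
      raise MalformedPage.new(@page, "unable to locate #{element_name}")
    end
    return yield(element) if block_given?
    element
  end

  def title
    @page["title"]
  end

  def links
    @page["links"]
  end
end

page = PageSketch.new("title" => "Login", "links" => [])
upcased = page.use_element(:title) { | title | title.upcase }
```

Calling `page.use_element(:links)` on this page raises `MalformedPage`, because the list of links is empty.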

Page 14: Magic of Ruby

...few pages my A@@

45 pages and 93 different

pieces of data

after a while...

Page 15: Magic of Ruby

i need to feel more confident with this...

Page 16: Magic of Ruby

rspec is your friend :-)

describe "is_registered" do

  context "XXX3760593" do

    it "should be a consumer registered" do
      result = command(:is_registered, :username => "XXX3760593")
      result.should_not be_an_error
      result["area"].should == "consumer"
      result["registered"].should == true
    end

  end

end

Page 17: Magic of Ruby

and then...

obviously not all the requests can be live

on our systems

Page 18: Magic of Ruby

enter the cache

def browse
  begin
    cache = CommandCache.new(database_path)
    configuration = YAML::load(File.open(configuration_path))
    agent = Mechanize.new do | agent |
      agent.user_agent_alias = "Linux Mozilla"
    end
    yield(agent, configuration, cache)
  rescue Mechanize::ResponseCodeError => error
    failure(LoadPageError.new(error))
  rescue Timeout::Error
    failure(TimeoutPageError.new)
  rescue ScrubError => error
    failure(error)
  rescue => error
    failure(UnknownError.new(error.to_s))
  ensure
    cache.close!
  end
end
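
The shape of `browse` is: set things up, yield to the caller, turn every exception into a failure result, and always release the cache. A stripped-down sketch of that shape follows; `with_cache`, `success`, `failure` and `FakeCache` are hypothetical, since the slides never show how `failure` is implemented.

```ruby
require "timeout"

class TimeoutPageError < StandardError; end

# Hypothetical result helpers; the slides call failure(...) without showing it.
def success(value)
  { "status" => "ok", "result" => value }
end

def failure(error)
  { "status" => "error", "error" => error.class.name }
end

# Hypothetical resource that must always be released.
class FakeCache
  attr_reader :closed

  def close!
    @closed = true
  end
end

# Yields to the caller, maps exceptions to failure results
# and closes the cache whether or not the block raised.
def with_cache(cache)
  success(yield(cache))
rescue Timeout::Error
  failure(TimeoutPageError.new)
rescue => error
  failure(error)
ensure
  cache.close!
end

cache = FakeCache.new
result = with_cache(cache) { raise Timeout::Error, "too slow" }
```

Because the `ensure` clause does not return explicitly, the value of the method is still the success or failure hash.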

Page 19: Magic of Ruby

enter the cache

def is_registered(username, without_cache)
  browse do | agent, configuration, cache |
    cache.command([ username, "is_registered" ]) do
      LoginPage.new(
        agent.get(configuration["login_page_url"])
      ).is_registered?(username)
    end
  end
end

single line change

Page 20: Magic of Ruby

enter the cache

class CommandCache

  def initialize(database_path)
    @database = create_database(database_path)
  end

  def command(keys)
    begin
      from_cache(keys)
    rescue NotInCache => e
      raise e if not block_given?
      to_cache(keys, yield)
    end
  end

end

better ask forgiveness than

permission
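
"Ask forgiveness" here means: read from the cache first and compute the value only when the read fails. A runnable sketch with an in-memory Hash in place of the database (`MemoryCommandCache` is a hypothetical stand-in; only `command`, `from_cache`, `to_cache` and `NotInCache` are names from the slides):

```ruby
class NotInCache < StandardError; end

# In-memory stand-in for CommandCache; the real one is backed by a database.
class MemoryCommandCache
  def initialize
    @store = {}
  end

  # Try the cache first; only on a miss run the block and store its result.
  def command(keys)
    from_cache(keys)
  rescue NotInCache => error
    raise error unless block_given?
    to_cache(keys, yield)
  end

  private

  def from_cache(keys)
    @store.fetch(keys) { raise NotInCache, keys.inspect }
  end

  def to_cache(keys, result)
    @store[keys] = result
  end
end

cache = MemoryCommandCache.new
calls = 0
first  = cache.command(["XXX3760593", "is_registered"]) { calls += 1; true }
second = cache.command(["XXX3760593", "is_registered"]) { calls += 1; true }
```

The block runs only for the first call; the second call is served from the store, so `calls` stays at 1.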

Page 21: Magic of Ruby

and then...

our systems cannot take

more than 25 concurrent requests...

make sure of it!!!

Page 22: Magic of Ruby

@!#$$@&#...

maybe we can use a proxy

...

Page 23: Magic of Ruby

god bless mechanize

def browse
  begin
    cache = CommandCache.new(database_path)
    configuration = YAML::load(File.open(configuration_path))
    proxy = configuration["proxy"]
    agent = Mechanize.new do | agent |
      agent.user_agent_alias = "Linux Mozilla"
      agent.set_proxy(proxy["host"], proxy["port"]) if proxy
    end
    yield(agent, configuration, cache)
  rescue Mechanize::ResponseCodeError => error
    failure(LoadPageError.new(error))
  rescue Timeout::Error
    failure(TimeoutPageError.new)
  rescue ScrubError => error
    failure(error)
  rescue => error
    failure(UnknownError.new(error.to_s))
  ensure
    cache.close!
  end
end

single line change

Page 24: Magic of Ruby

and then...

well, you know, we have a lot of users, so when the proxy says it is overloaded you must retry a few times before giving up

Page 25: Magic of Ruby

@!#$$@&#

Page 26: Magic of Ruby

class Mechanize

  alias real_fetch_page fetch_page

  def fetch_page(params)
    ...
    attempts = 0
    begin
      attempts += 1
      real_fetch_page(params)
    rescue Net::HTTPServerException => error
      if is_overloaded?(error)
        sleep wait_for_seconds and retry if attempts < retry_for_times
        raise SystemError.new("SystemOverloaded")
      end
      raise error
    end
  end

  def is_overloaded?(error)
    error.response.code == "403"
  end

end

god bless ruby

look at this line!!!
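
The highlighted line packs sleep, retry and the attempt limit into one statement. A self-contained sketch of the same construct, outside Mechanize (`with_retries`, `Overloaded` and the parameter names are hypothetical):

```ruby
class Overloaded < StandardError; end

# Runs the block, retrying on Overloaded up to max_attempts times in total.
def with_retries(max_attempts, wait_seconds)
  attempts = 0
  begin
    attempts += 1
    yield(attempts)
  rescue Overloaded
    # sleep returns an Integer, which is always truthy in Ruby,
    # so `and retry` runs whenever the `if` condition holds.
    sleep wait_seconds and retry if attempts < max_attempts
    raise
  end
end

tries = []
result = with_retries(3, 0) do | attempt |
  tries << attempt
  raise Overloaded if attempt < 3
  "page content"
end
```

`retry` jumps back to the `begin`, so the counter keeps its value between attempts; once the limit is reached the bare `raise` re-raises the last error.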

Page 27: Magic of Ruby

we can also test it :-)

require "webrick"
require "webrick/httpproxy"

class WEBrick::HTTPResponse

  # replace the response body with the given content
  def serve(content)
    self.body = content
    self["Content-Length"] = content.length
  end

  # answer like an overloaded squid proxy
  def overloaded
    serve("<html><body>squid</body></html>")
    self.status = 403
  end

end

proxy = WEBrick::HTTPProxyServer.new(
  :Port => 2200,
  :ProxyContentHandler => Proc.new do | request, response |
    response.overloaded
  end
)

trap("INT") { proxy.shutdown }
proxy.start

Page 28: Magic of Ruby

finally ;-)

well... i guess we can release it...

Page 29: Magic of Ruby

the unexpected

but... wait... our i.t. department said that sometimes it crashes

Page 30: Magic of Ruby

you need to fix it by

tomorrow!!!

the unexpected

Page 31: Magic of Ruby

@!#$$@&#@!#$$@&#@!#$$@&#@!#$$@&#@!#$$@&#@!#$$@&#

Page 32: Magic of Ruby

If you want something done, do it yourself

how to transform a command line program into a web application

class ScrubsHandler < Mongrel::HttpHandler

  def process(request, response)
    command = request.params["PATH_INFO"].tr("/", "")
    elements = Mongrel::HttpRequest.query_parse(request.params["QUERY_STRING"])
    parameters = elements.inject([]) do | parameters, parameter |
      name, value = parameter
      parameters << if value.nil?
        "--#{name}"
      else
        "--#{name}='#{value}'"
      end
    end.join(" ")
    response.start(200) do | head, out |
      head["Content-Type"] = "application/json"
      out.write(scrubs.execute(command, parameters))
    end
  end

  ...
end

almost a single line change
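
The handler works because a query string maps almost directly onto command line options. A sketch of just that mapping, using `URI.decode_www_form` from the standard library instead of Mongrel's parser (`query_to_options` is a hypothetical helper; note that a parameter without a value arrives here as an empty string rather than `nil`):

```ruby
require "uri"

# Hypothetical helper: turns a URL query string into command line style options.
def query_to_options(query_string)
  URI.decode_www_form(query_string).map do | name, value |
    # a bare parameter such as "without-cache" becomes a flag
    value.empty? ? "--#{name}" : "--#{name}='#{value}'"
  end.join(" ")
end

parameters = query_to_options("username=XXX3760593&without-cache")
```

As on the slide, the value is interpolated without escaping; if the resulting string ever reached a shell, `Shellwords.escape` would be the safer choice.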

Page 33: Magic of Ruby

can this be true ?!?!?

well... i guess we can release it...

Page 34: Magic of Ruby

all the requests are live!!!

our systems are melting down!!! fix it!!! now!!!

after a while...

Page 35: Magic of Ruby

@!#$$@&#@!#$$@&#@!#$$@&#@!#$$@&#@!#$$@&#@!#$$@&#

Page 36: Magic of Ruby

change the cache implementation

use the file system luke...

# remove the cached entry; pass the result through, marked as not cached
def expire(keys, result = nil)
  FileUtils.rm path(keys), :force => true
  result.merge({ "from_cache" => false }) unless result.nil?
end

# expire the entry only when it is older than the given number of seconds
def expire_after(keys, seconds, result = nil)
  expire(keys, result) if (from_cache(keys)["cached_at"] + seconds) <= now rescue nil
end

def from_cache(keys)
  cache_file_path = path(keys)
  # File.exist? replaces File.exists?, which was removed in Ruby 3.2
  raise NotInCache.new(keys) unless File.exist?(cache_file_path)
  JSON.parse(File.read(cache_file_path)).merge({ "from_cache" => true })
end

def to_cache(keys, result)
  result = result.merge({ "cached_at" => now })
  File.write(path(keys), JSON.generate(result))
  result.merge({ "from_cache" => false })
end
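
The slide leaves out `path` and `now`. A runnable sketch of the file-backed cache with those two helpers filled in as assumptions (one JSON file per key list inside a temporary directory, timestamps as integer seconds; `FileCommandCache` is a hypothetical name):

```ruby
require "json"
require "tmpdir"
require "fileutils"

class NotInCache < StandardError; end

# File-backed cache sketch: one JSON file per key list.
class FileCommandCache
  def initialize(directory)
    @directory = directory
  end

  def from_cache(keys)
    cache_file_path = path(keys)
    raise NotInCache, keys.inspect unless File.exist?(cache_file_path)
    JSON.parse(File.read(cache_file_path)).merge("from_cache" => true)
  end

  def to_cache(keys, result)
    result = result.merge("cached_at" => now)
    File.write(path(keys), JSON.generate(result))
    result.merge("from_cache" => false)
  end

  # Removes the entry when it is older than the given number of seconds.
  def expire_after(keys, seconds)
    cached_at = from_cache(keys)["cached_at"]
    FileUtils.rm(path(keys), :force => true) if cached_at + seconds <= now
  rescue NotInCache
    nil
  end

  private

  # Assumed layout: keys joined with "_" inside the cache directory.
  def path(keys)
    File.join(@directory, keys.join("_") + ".json")
  end

  # Assumed clock: seconds since the epoch.
  def now
    Time.now.to_i
  end
end

directory = Dir.mktmpdir
cache = FileCommandCache.new(directory)
keys = ["XXX3760593", "is_registered"]
stored = cache.to_cache(keys, { "registered" => true })
loaded = cache.from_cache(keys)
cache.expire_after(keys, 0)   # zero seconds: the entry is already stale
```

After the last line the file is gone, so another `from_cache(keys)` raises `NotInCache` and the next request goes to the live system again.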

Page 37: Magic of Ruby

are you handling the maintenance page right?

after a while...

Page 38: Magic of Ruby

maintenance page detection

class PageToScrub

  def initialize(page)
    @page = page
    check_page_errors
    check_for_maintenance
  end

  def check_for_maintenance
    @page.search("//td[@class='txtbig']").each do | node |
      if extract_text_from(node.search("./descendant::text()")) =~ /^.+?area.+?clienti.+?non.+?disponibile.+?stiamo.+?lavorando/im
        raise OnMaintenancePage.new(@page, "??? is on maintenance")
      end
    end
  end

  ...

end
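
The detection itself is a loose regular expression over the page text. A sketch that applies the slide's pattern to plain strings instead of Nokogiri nodes (`maintenance_text?` is a hypothetical helper and both sample texts are invented, not taken from the real site):

```ruby
# Pattern copied from the slide: i ignores case, m lets . match newlines in Ruby.
MAINTENANCE = /^.+?area.+?clienti.+?non.+?disponibile.+?stiamo.+?lavorando/im

# Hypothetical helper working on plain text.
def maintenance_text?(text)
  !!(text =~ MAINTENANCE)
end

# Invented sample texts.
maintenance = "L'Area Clienti non e' disponibile.\nStiamo lavorando per voi."
regular     = "Benvenuto nella tua area clienti"

on_maintenance = maintenance_text?(maintenance)
on_regular     = maintenance_text?(regular)
```

The lazy `.+?` gaps tolerate markup, punctuation and line breaks between the keywords, at the price of matching any text where those words appear in that order.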

Page 39: Magic of Ruby

good job gabriele, it’s

working beyond our expectations

after a few days

Page 40: Magic of Ruby

tell me, these

“robots” of yours can be

used to check our systems

after a few days

Page 41: Magic of Ruby

...

yes!

Page 42: Magic of Ruby

in the end...

• almost all self care’s features are replicated
• ~500,000 unique users/day
• ~12,000,000 requests/day
• ~4 GB of cached data
• specs are used to monitor the entire system

Page 43: Magic of Ruby

but in the beginning was...

it’s very easy, we

need something quick and

dirty...

Page 44: Magic of Ruby

that is for me the ruby

magic :-)

Page 45: Magic of Ruby

questions?