Transcript of "erlang at hover.in", Devcamp Blr 09
- 1. when erlang makes sense (erlang at hover.in)
http://developers.hover.in
Bhasker V Kode, co-founder & CTO at hover.in
at devCamp, Bangalore, Apr 11, 2009
- 2. brief introduction to hover.in: choose words from your blog
and decide what content / ad you want when you hover* over them
(* or other events like click, right click, etc.)
- 3. brief introduction to hover.in: or... the world's first
publisher-driven in-text content & ad delivery platform...
- 4. brief introduction to hover.in: or... it lets web publishers push
client-side event handling to the cloud, to run various rich
applications called hoverlets
- 5.
- hover.in founded late 2007
- 6.
- hover.in founded late 2007
- the web ~ 10-20 years old
- 7.
- hover.in founded late 2007
- the web ~ 10-20 years old
- humans: hundreds of thousands of years
- 8.
- hover.in founded late 2007
- the web ~ 10-20 years old
- humans: hundreds of thousands of years
- but bacteria... around for millions of years... so this talk
is going to be about what we can learn from bacteria, the brain,
and memory in a concurrent world
- 9. where are we heading
- the C10k problem & beyond
- disk, hosting getting comparatively cheaper
- horizontal scaling, moving beyond a single node
- era of multi-core computing: how do we utilize more CPUs
packed into each chip
- 10. erlang
- functional programming language, ideal for
concurrent, distributed, fault-tolerant applications
- relies on message passing
- immutable state, .erl files compiled to .beam
- can explicitly decide if tasks need to be done:
  - synchronously or asynchronously
  - in parallel (concurrent), or distributed (multi-node)
- 11. demo
- erl -name 'node1@0.0.0.0' -setcookie secret
- erl -name 'node2@0.0.0.0' -setcookie secret
- > net_adm:ping('node1@0.0.0.0').
- now that they recognize each other, try:
  rpc:call(TargetNode, Module, Fn, Args).
  spawn(Node, Fn).
  spawn(Node, Module, Fn, Args).
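The rpc:call step of the demo can be sketched as a small module. This is only an illustration: the module name rpc_demo and the function remote_sum/1 are made up here, and the call simply asks the target node to run lists:sum/1.

```erlang
-module(rpc_demo).
-export([remote_sum/1]).

%% rpc_demo and remote_sum/1 are hypothetical names for this sketch.
%% Ask Node to evaluate lists:sum([1, 2, 3]) and return the result here.
%% rpc:call/4 returns {badrpc, Reason} when the node cannot be reached.
remote_sum(Node) ->
    case rpc:call(Node, lists, sum, [[1, 2, 3]]) of
        {badrpc, Reason} -> {error, Reason};
        Sum -> {ok, Sum}
    end.
```

From node2 in the demo, rpc_demo:remote_sum('node1@0.0.0.0') would run the sum on node1, provided the module's code path does not matter (lists is available everywhere). Passing node() runs it on the local node, which is a quick way to try it.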
- 12. this is what erlang looks like...
- lists:foldl(fun(X, Accumulator) ->
      Rem = X rem 2,
      case Rem of
          0 -> Accumulator ++ [X];
          _ -> Accumulator
      end
  end, [], lists:seq(1, 10)).
- gives [2,4,6,8,10]
- 13. bacteria that exhibit group dynamics
- bacteria perform group operations, much like a lists fold
operation
- each bacterium spawns its own set of proteins; only when
the Accumulator is > some threshold will the group dynamics of
making light (bioluminescence) kick in (eg: deep sea animals)
- all bacteria have some sort of presence & replies
associated with them, which are queried
- 14.
- in erlang, you can create a new concurrent process that
evaluates a Fun.
- Pid = spawn(fun() ->
      %% do something
      ok
  end)
- Pid ! Msg, message sending is asynchronous
- erlang can be set to utilize smp support
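A minimal sketch of spawn plus asynchronous send, assuming a made-up module echo_demo: the spawned process echoes whatever it is sent, and the caller waits for the reply with a timeout.

```erlang
-module(echo_demo).
-export([start/0, ask/2]).

%% echo_demo is a hypothetical module name for this sketch.
%% Spawn a process that answers every {From, Msg} with {self(), Msg}.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {From, Msg} ->
            From ! {self(), Msg},
            loop()
    end.

%% The send returns immediately; only the receive blocks,
%% and it gives up after one second.
ask(Pid, Msg) ->
    Pid ! {self(), Msg},
    receive
        {Pid, Reply} -> {ok, Reply}
    after 1000 ->
        timeout
    end.
```

The `!` itself never waits for the receiver. Any synchronous behaviour, like ask/2 here, is built on top by waiting for a reply message.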
- 15. what does smp mean
- in the erlang VM, smp support refers to symmetric multiprocessing:
typically one scheduler thread per core, so processes run in parallel
- the acronym is also used for Structured Message Passing, which
supports the dynamic construction of process families that
communicate through asynchronous messages
- Process families can be connected together
- Each process can communicate with its parent, its children, and
a subset of its siblings, as specified by the family topology
- 16. pattern matching
- each molecule connects to its specific receptor protein to
complete the missing piece, to ignite the group behaviour that is
only successful when all of the cells participate in unison.
- Type = case UserType of user -> true; admin -> true;
_Else -> false end
- 17. binary pattern matching
- Suggest = fun(UserTyped) -> lists:foldl(fun(X, Acc) ->
      Size = Get_size(UserTyped), %% 8 bits per char
      case UserTyped of
          <<...>> -> Acc ++ X;
          _Else -> Acc
      end
  end, [], Words) end,
- Suggest(...) gives [..., ...]
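The binaries on this slide did not survive in the transcript, so here is a separate, minimal sketch of the same idea: suggesting words by matching a typed prefix with a binary pattern. The module name suggest_demo and the word list are assumptions, not the hover.in code.

```erlang
-module(suggest_demo).
-export([suggest/2]).

%% suggest_demo is a hypothetical module name for this sketch.
%% Return every word in Words (a list of binaries) that starts with Typed.
suggest(Typed, Words) when is_binary(Typed) ->
    Size = byte_size(Typed),
    lists:filter(
      fun(Word) ->
              case Word of
                  %% Typed and Size are already bound, so this clause
                  %% matches only when the first Size bytes equal Typed.
                  <<Typed:Size/binary, _Rest/binary>> -> true;
                  _Else -> false
              end
      end,
      Words).
```

For example, suggest_demo:suggest(<<"hov">>, [<<"hover">>, <<"erlang">>, <<"hoverlet">>]) keeps the two words that begin with "hov".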
- 18. fault-tolerance
- the father of DNA says that all humans have 10 places in our
genome where we have lost one or gained another one
- erlang can catch timeouts when waiting to receive a reply. once
you spawn a process, you can monitor it & link errors, or use the
erlang/OTP architecture to supervise processes and set restart
strategies
- enabled Ericsson to run with nine 9's uptime (99.9999999%)
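Monitoring a spawned process and catching a timeout can be sketched like this. The module name monitor_demo and the 5 second limit are assumptions picked for the example.

```erlang
-module(monitor_demo).
-export([run/1]).

%% monitor_demo is a hypothetical module name for this sketch.
%% Run Fun in a separate, monitored process and report how it ended.
run(Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, normal} -> ok;
        {'DOWN', Ref, process, Pid, Reason} -> {crashed, Reason}
    after 5000 ->
        %% Waited long enough: stop monitoring and kill the worker.
        erlang:demonitor(Ref, [flush]),
        exit(Pid, kill),
        timeout
    end.
```

A crash in the worker never takes the caller down. It just arrives as a 'DOWN' message, which is the building block that OTP supervisors add restart strategies to.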
- 19. fault tolerance in the real world
- for a single google search result, the same request is sent
to multiple machines (~1000 as of 09); whichever replies the
quickest wins.
- amazon's dynamo architecture that powers S3 uses a (3,2,2)
rule, ie maintain 3 copies of the same data; reads/writes are
successful only when 2 concurrent requests succeed. This ratio
varies based on SLA, internal vs public service. (more on conflict
resolution...)
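The (3,2,2) rule reads as N = 3 copies, R = 2 reads, W = 2 writes. A tiny sketch of the bookkeeping, with made-up names (quorum_demo, write_ok/2, overlaps/3) and a plain list of acknowledgements standing in for real replicas:

```erlang
-module(quorum_demo).
-export([write_ok/2, overlaps/3]).

%% quorum_demo and its functions are hypothetical names for this sketch.
%% Acks holds one ok | error entry per replica that was contacted.
%% The write counts as successful once at least W replicas said ok.
write_ok(Acks, W) ->
    length([ok || ok <- Acks]) >= W.

%% Read and write sets are guaranteed to share a replica when R + W > N.
overlaps(N, R, W) ->
    R + W > N.
```

With (3,2,2), one failed replica out of three still lets the write through, and 2 + 2 > 3 means a read always touches at least one replica that saw the latest write.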
- 20. inter-species communication
- if you look at your skin, it consists of very many different
species, but all bacteria are found to communicate using one common
chemical language.
- 21. inter-species communication
- if you look at your skin, it consists of very many different
species, but all bacteria are found to communicate using one common
chemical language. hmmmmmmmmmmmmmmmmmmm..............
....serialization?! ....a common protein interpreter?! ....or
perhaps just-in-time protein compilation?!
- 22. interspecies comm. in the real world
- attempts at serialization, cross language communication
include:
  - protocol buffers (by google)
  - base64 en/decoding, port based communication (erlang/python at
    hover.in)
- as for interspecies communication look no further:
- 23. interspecies comm. in the real world
- as for interspecies communication in the real world... look no
further: JRuby!!! IronPython!!! GWT!!! ...& coming to a theatre
near you this..... FortScala??? FlashTheApple++???
VisualJavaTranScriptual????
- 24. talking about scaling
- The brain of the worker honeybee weighs about 1mg, the total
number of neurons in its brain is estimated to be 950,000
- Flies acrobatically, recognizes patterns, navigates, forages,
communicates
- Energy consumption: 10^-15 J/op, at least 10^6 times more efficient
than digital silicon
- 25. the human brain
- 100 billion neurons, stores ~100 TB
- Differential analysis e.g., we compute color
- Multiple inputs: sight, sound, taste, smell, touch
- Facial recognition subcircuits, peripheral vision
- in essence, the left & right brain vary in:
  - left -> persistent disk, handles past/future
  - right -> temporal caches!, handles present
- 26. what we can learn
- why sharding data is critical in concurrent programming (DB
transaction locks)
- implementing flowcontrol (eg: memory game)
- event based alarms, timeouts, supervised workers
- wrt performance/scaling, you can't improve what you
can't measure
- 27. working with data concurrently?
- typical web backends keep all user data in one table, then clustering
just splits that on an arbitrary basis. Query content table where
user=user1,
- what if you have N concurrent processes accessing N different user
tables: no locks, and you can parallelize since they are sufficiently
unrelated.
- same with mapreduce algos: if the data
is sufficiently unrelated, then it is easy to parallelize, and results
can come back asynchronously
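When the pieces of work are unrelated, a parallel map is a few lines. This is a minimal sketch under the assumption that every call to Fun succeeds. The module name pmap_demo is made up and this is not the hi_pmap.erl listed later.

```erlang
-module(pmap_demo).
-export([pmap/2]).

%% pmap_demo is a hypothetical module name for this sketch.
%% Apply Fun to every element in its own process, then collect the
%% results in the original order. There is no error handling: if Fun
%% crashes for some element, the matching receive waits forever.
pmap(Fun, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, Fun(X)} end),
                Ref
            end || X <- List],
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

The results may arrive in any order. Tagging each one with its own reference is what lets the second list comprehension put them back in sequence.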
- 28. retrieving data concurrently?
- replication vs location transparency: are they fragmented, are
some nodes read-only? (rpc...)
- need metadata for which node to access for user1 (or use a
hashing fn like memcache)
- are tables in-memory (right brain), cached from disk, or on
disk alone (left brain)
- mnesia, erlang's inbuilt ~database, lets you make highly
granular choices
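The "hashing fn" option for finding the right node can be sketched with erlang:phash2/2. The module name shard_demo and the node names are placeholders, not hover.in's setup.

```erlang
-module(shard_demo).
-export([node_for/2]).

%% shard_demo is a hypothetical module name for this sketch.
%% Pick the node responsible for Key from a non-empty list of nodes.
%% erlang:phash2(Key, N) returns an integer in the range 0..N-1.
node_for(Key, Nodes) ->
    Index = erlang:phash2(Key, length(Nodes)),
    lists:nth(Index + 1, Nodes).
```

The same key always lands on the same node, so no lookup table is needed. The trade-off is that adding or removing a node moves most keys, which is where the metadata approach or consistent hashing comes in.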
- 29. measurement
- you can't improve what you can't measure.
- introducing the heat-seeking algo at hover.in
- using tsung (written in erlang again), a load/performance testing
tool, for simulating 100's of concurrent users/requests; great
for analysing bottlenecks of your system, benchmarking content
delivery networks (CDNs), etc
- 30. built-in datatypes
- atoms, integers, floats, tuples, lists
- and unlike mysql, you can store complex datatypes in mnesia,
and pattern match them
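A minimal sketch of storing nested terms in mnesia and pattern matching on them, assuming a RAM-only table. The table name hoverword, its attributes and the sample rows are all invented for the example.

```erlang
-module(mnesia_demo).
-export([run/0]).

%% mnesia_demo and the hoverword table are hypothetical names.
run() ->
    ok = mnesia:start(),
    %% A RAM-only table on this node; no disc schema is needed for that.
    {atomic, ok} = mnesia:create_table(hoverword,
                                       [{ram_copies, [node()]},
                                        {attributes, [word, meta]}]),
    %% The meta column holds a tuple containing a list of atoms.
    ok = mnesia:dirty_write({hoverword, "erlang", {tags, [language, beam]}}),
    ok = mnesia:dirty_write({hoverword, "bacteria", {tags, [biology]}}),
    %% '_' is a wildcard: return rows whose meta is exactly {tags, [biology]}.
    mnesia:dirty_match_object({hoverword, '_', {tags, [biology]}}).
```

The dirty_* functions skip transactions to keep the sketch short. mnesia:transaction/1 with mnesia:write/1 and mnesia:match_object/1 is the safer route when several processes write at once.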
- 31. temporal data
- erlang/OTP comes with building blocks for making granular
choices in data structures:
  - gen_server -> client/server architecture for processes
  - gen_fsm -> finite state machine, etc
- move over db, files. A single spawned erlang process can hold
state. we use it to build our own set/get based cache workers.
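A process holding state needs nothing more than a recursive loop. This is a minimal sketch of a set/get cache worker, assuming a made-up module cache_demo and a map as the state. It is not the hi_cache_worker.erl from the deck.

```erlang
-module(cache_demo).
-export([start/0, set/3, get/2]).

%% cache_demo is a hypothetical module name for this sketch.
start() ->
    spawn(fun() -> loop(#{}) end).

%% The map handed from one loop/1 call to the next is the process state.
loop(State) ->
    receive
        {set, Key, Value} ->
            loop(State#{Key => Value});
        {get, From, Ref, Key} ->
            From ! {Ref, maps:find(Key, State)},
            loop(State)
    end.

%% Fire and forget.
set(Pid, Key, Value) ->
    Pid ! {set, Key, Value},
    ok.

%% Returns {ok, Value}, error when the key is unknown, or timeout.
get(Pid, Key) ->
    Ref = make_ref(),
    Pid ! {get, self(), Ref, Key},
    receive
        {Ref, Result} -> Result
    after 1000 ->
        timeout
    end.
```

The state lives only as long as the process does, which is exactly what makes it a temporal cache rather than persistent storage.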
- 32. temporal data in the real world
- you listen to a phone number in batches of 3 or 4 digits. the
part that absorbs it just before writing is temporal, until you write
it into your contact book or memorize it (persistent)
- a smart way of building counters that move ultra-fast in
erlang would be a gen_server that resides in-memory, accepting
requests via a flowcontrol, getting a new state, and then writing
to disc/db when convenient.
- 33. more on synchronous, asynchronous
- gen_servers let you make synchronous calls that are blocking (
wait till it returns with the result), and can catch timeouts,
restarts, etc.
- or they can be non-blocking asynchronous casts (send the instruction
to send a mail and return to the thank you page immediately; don't
wait for the mail to be sent).
- of course, you can also use spawn and pids to write your own
implementation of the gen_*'s
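The difference between a call and a cast shows up in a small in-memory counter. This is a minimal sketch with an invented module name, counter_demo. Writing the count to disc "when convenient" is left out.

```erlang
-module(counter_demo).
-behaviour(gen_server).
-export([start_link/0, increment/1, value/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% counter_demo is a hypothetical module name for this sketch.
start_link() ->
    gen_server:start_link(?MODULE, 0, []).

%% Asynchronous cast: returns ok at once, the server catches up later.
increment(Pid) ->
    gen_server:cast(Pid, increment).

%% Synchronous call: blocks until the server replies, and exits with
%% a timeout if no reply arrives within 5000 ms.
value(Pid) ->
    gen_server:call(Pid, value, 5000).

%% The server state is just the current count.
init(Start) ->
    {ok, Start}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1}.
```

Messages from one process to another keep their order, so a value/1 call made right after two increment/1 casts from the same caller already sees both increments.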
- 34. flowcontrol
- queues to handle intents of reads/writes, to determine
bottlenecks (eg: your own, rabbitmq, etc)
- eg1: when we add jobs to the queue, if it takes greater than X
consistently we move it to a high traffic bracket and do things
differently, possibly add workers or ignore based on the task.
- eg2: amazon shopping carts are known to be extra resilient to
write failures (don't mind multiple versions of them over
time)
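The timing check in eg1 can be sketched with timer:tc/1. This looks at a single job only, whereas the slide talks about jobs that are slow consistently. The module name flow_demo and the bracket names are assumptions.

```erlang
-module(flow_demo).
-export([bracket/2]).

%% flow_demo, high_traffic and normal are hypothetical names.
%% Run Job (a fun taking no arguments), time it in microseconds and
%% classify it against LimitMicros.
bracket(Job, LimitMicros) ->
    {Micros, Result} = timer:tc(Job),
    Bracket = if
                  Micros > LimitMicros -> high_traffic;
                  true -> normal
              end,
    {Bracket, Result}.
```

A real flowcontrol would keep a running history per job type and only switch brackets after several slow runs in a row.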
- 35. supervisors, workers
- as bacteria grow, they split into two. when muscle tears, it
knows exactly what to replace.
- erlang supervisors can decide restart policies: if one worker
fails, restart all... or if one worker fails, restart just that
worker, and more tweaks.
- can spawn multiple workers on the fly, much like the need for
launching a new ec2 instance
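The two restart policies map onto the one_for_one and one_for_all strategies in OTP. This is a minimal sketch with one trivial worker. The module name sup_demo, the worker id and the restart limits are assumptions.

```erlang
-module(sup_demo).
-behaviour(supervisor).
-export([start_link/0, start_worker/0, init/1]).

%% sup_demo and the worker below are hypothetical, for illustration only.
start_link() ->
    supervisor:start_link(?MODULE, []).

init([]) ->
    %% one_for_one: restart just the worker that died.
    %% one_for_all would restart every child instead.
    %% Give up if more than 5 restarts happen within 10 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Called by the supervisor; the worker must be linked to it.
start_worker() ->
    {ok, spawn_link(fun worker_loop/0)}.

%% A trivial worker that just swallows messages.
worker_loop() ->
    receive _ -> worker_loop() end.
```

Killing the worker with exit(Pid, kill) and then asking supervisor:which_children/1 a moment later shows a fresh pid under the same id.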
- 36. how do you know where to start/stop
- build your own bacteria antidote: stress tests to see, on your
typical production server:
  - how many processes can you create, how many open file sockets can
    you have (system limitations, tweak)
  - how many tables can you store
  - how many rows will it take before it gets slow (time to
    fragment)
- learn from the brain, gr8 videos on ted.com
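Some of the limits asked about on slide 36 can be read straight from the running VM before any stress test. A small sketch follows; the module name limits_demo and the choice of keys are assumptions.

```erlang
-module(limits_demo).
-export([report/0]).

%% limits_demo is a hypothetical module name for this sketch.
%% Snapshot of a few VM limits and current usage on this node.
report() ->
    #{process_limit => erlang:system_info(process_limit),
      process_count => erlang:system_info(process_count),
      port_limit    => erlang:system_info(port_limit),
      ets_tables    => length(ets:all())}.
```

These are the VM's own ceilings. Operating system limits such as open file descriptors are set outside erlang and need checking separately.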
- 37. summary of tech at hover.in
- 3 node cluster (64-bit, 4 GB) on the LYME stack
- python crawler, associated NLP parsers, mini metadata
interpreter in client-side js, maybe moving to spidermonkey
- remote node debugger/handler, flowcontrol, heat-seeking, cpu
time-slicing algos, headless firefox for thumbnails queue
- touching 500k hovers/month in Apr '09, upwards of 25 million S3
GET requests/month
- 38. summary of our erlang modules
- rewrites.erl error.erl frag_mnesia.erl hi_api_response.erl
hi_appmods_api_user.erl hi_cache_app.erl hi_cache_sup.erl
hoverlingo.erl hi_cache_worker.erl hi_classes.erl hi_community.erl
hi_cron_hoverletupdater_app.erl hi_cron_hoverletupdater.erl
hi_cron_hoverletupdater_sup.erl hi_cron_kwebucket.erl hi_crypto.erl
hi_flowcontrol_hoverletupdater.erl hi_htmlutils_site.erl
hi_hybridq_app.erl hi_hybridq_sup.erl hi_hybridq_worker.erl
hi_login.erl hi_mailer.erl hi_messaging_app.erl
hi_messaging_sup.erl hi_messaging_worker.erl hi_mgr_crawler.erl
hi_mgr_db_console.erl hi_mgr_db.erl hi_mgr_db_mnesia.erl
hi_mgr_hoverlet.erl hi_mgr_kw.erl hi_mgr_node.erl hi_mgr_thumbs.erl
hi_mgr_traffic.erl hi_nlp.erl hi_normalizer.erl
hi_pagination_app.erl hi_pagination_sup.erl
hi_pagination_worker.erl hi_pmap.erl hi_register_app.erl
hi_register.erl hi_register_sup.erl hi_register_worker.erl
hi_render_hoverlet_worker.erl hi_rrd.erl hi_rrd_worker.erl
hi_settings.erl hi_sid.erl hi_site.erl hi_stat.erl
hi_stats_distribution.erl hi_stats_overview.erl hi_str.erl
hi_trees.erl hi_utf8.erl hi_yaws.erl
- 39. thank you
- 40. references
- http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
- all the amazing brain-related talks at http://ted.com
- shoutout to everyone at #erlang!