
Rubyslava / PyVo #32

26.09.2013

Imperative versus Functional Programming

Jan Herich

@janherich @itedge

Core aspects of Imperative Programming


● Emphasis on mutable state
➢ In-place modification of variables (memory locations)
➢ The flow of the program is determined by directly checking those memory locations → a typical example is imperative looping:

for (int i = 1; i < 11; i++) {
    System.out.println("Count is: " + i);
}




● Rooted in single-threaded premise
➢ Assuming that there is only one thread of execution
➢ So the world is effectively “stopped” when you look at it or change it





● Prevalent in most OO languages
➢ C++, Java, C#, Python, Ruby, etc.

What's wrong with imperative programming ?


● Uncoordinated mutation
➢ No built-in facilities in the language to coordinate changes
➢ This can result in brittle systems, even without concurrency
➢ Add concurrency/parallelism and everything only gets worse



● Complecting state and identity in OO
➢ Object reference → identity and value mixed together
➢ An object (identity) is a pointer to the memory that contains the value of its state
➢ There is no way to observe a stable state (even to copy it) without blocking others from changing it
➢ There is no way to associate the identity's state with a different value other than in-place memory mutation
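A minimal JavaScript sketch of the point above. The names `account` and `snapshot` are hypothetical and only serve the illustration: a variable that was meant to hold a stable value in fact holds the identity, so it changes along with it.

```javascript
// Hypothetical object standing in for any mutable domain object.
var account = { owner: 'alice', balance: 100 };

// Intended as a "snapshot" of the current state, but assignment
// copies the reference (identity), not the value.
var snapshot = account;

// In-place mutation through the original reference...
account.balance = 0;

// ...is visible through every other reference to the same object.
console.log(snapshot.balance);     // 0
console.log(snapshot === account); // true
```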

Example of harmful mutation

Our initial objects:

var record1 = {state_id: 'S2', county_id: 'C1', population: 3439, area: 97};
var record2 = {state_id: 'S5', county_id: 'C2', population: 85345, area: 128};
var record3 = {state_id: 'S2', county_id: 'C3', population: 7435, area: 157};


Reasonably nice function without side effects:

var groupRecords = function(records, key) {
  var groups = {};
  for (var i = 0; i < records.length; i++) {
    var current = records[i];
    var keyValue = current[key];
    if (!groups.hasOwnProperty(keyValue)) {
      groups[keyValue] = [];
    }
    groups[keyValue].push(current);
  }
  return groups;
};



New grouped data structure:

var grouped = groupRecords([record1, record2, record3], 'state_id');


Ugly side-effecting function, written by an inexperienced programmer who doesn't know much about object references:

var sumRecords = function(records) {
  var first = records[0];
  for (var i = 1; i < records.length; i++) {
    var current = records[i];
    // Accumulates into the first record itself, not into a fresh object
    first.population += current.population;
    first.area += current.area;
  }
  return first;
};


We don't expect this function call to mutate the original object record1, but it does, because the returned sum is record1 itself:

var state2summed = sumRecords(grouped.S2);
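The effect can be checked directly. The sketch below repeats the two S2 records and the logic of sumRecords so that it runs on its own; passing the array `[record1, record3]` by hand stands in for the S2 group and is an assumption of this sketch.

```javascript
var record1 = {state_id: 'S2', county_id: 'C1', population: 3439, area: 97};
var record3 = {state_id: 'S2', county_id: 'C3', population: 7435, area: 157};

// Same logic as sumRecords: accumulates into the first element in place.
var sumRecords = function(records) {
  var first = records[0];
  for (var i = 1; i < records.length; i++) {
    first.population += records[i].population;
    first.area += records[i].area;
  }
  return first;
};

var state2summed = sumRecords([record1, record3]);

// The "sum" is the very same object as record1...
console.log(state2summed === record1); // true
// ...so the original county record now carries the state totals.
console.log(record1.population);       // 10874
console.log(record1.area);             // 254
```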

How can we do better ?


● What if every data structure in your program were immutable ?
➢ It would solve our problem with mutable references leaking all over the codebase



● But how can we model any progress if everything is static ?
➢ It's nice to be safe, but how can we actually accomplish anything if everything is immutable and we can't change it ? It turns out we can, if we use persistent data structures → not being able to change something in place doesn't mean that we can't model progress




● What is a persistent data structure ?
➢ A persistent data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in place, but instead always yield a new updated structure
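A minimal sketch of that definition, using a singly linked list. The helpers `cons` and `toArray` are hypothetical names introduced here, and `Object.freeze` is used only to make accidental in-place updates fail.

```javascript
// Prepending never touches the existing list; it returns a new version.
function cons(head, tail) {
  return Object.freeze({ head: head, tail: tail });
}

// Helper to inspect a list version as a plain array.
function toArray(list) {
  var out = [];
  for (var node = list; node !== null; node = node.tail) {
    out.push(node.head);
  }
  return out;
}

var v1 = cons(2, cons(1, null)); // version 1
var v2 = cons(3, v1);            // version 2, built from version 1

console.log(toArray(v1).join(' ')); // 2 1   (previous version preserved)
console.log(toArray(v2).join(' ')); // 3 2 1 (new updated structure)
```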

How persistent data structures work

● Creation of new data structures must be fast and efficient

● This is achieved by structural sharing

● Obviously, garbage collection is a must in this case

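[Figure: tree xs with root d, inner nodes b and g, and leaves a, c, f, h; tree ys with the new nodes d', g', f' and e. All other nodes are shared between xs and ys.]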
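The trees xs and ys can be reproduced with a small path-copying insert. This is a sketch under assumptions, not the way Clojure implements its collections: `node` and `insert` are hypothetical names, and the tree is a plain unbalanced binary search tree.

```javascript
// Immutable tree node.
function node(key, left, right) {
  return Object.freeze({ key: key, left: left, right: right });
}

// Returns a new tree containing `key`. Only the nodes on the path from
// the root to the insertion point are copied; every other subtree is
// reused as-is (structural sharing).
function insert(tree, key) {
  if (tree === null) { return node(key, null, null); }
  if (key < tree.key) { return node(tree.key, insert(tree.left, key), tree.right); }
  if (key > tree.key) { return node(tree.key, tree.left, insert(tree.right, key)); }
  return tree; // key already present: reuse the whole tree
}

var xs = ['d', 'b', 'g', 'a', 'c', 'f', 'h'].reduce(insert, null);
var ys = insert(xs, 'e');

console.log(ys !== xs);                       // true: new root d'
console.log(ys.left === xs.left);             // true: subtree b is shared
console.log(ys.right !== xs.right);           // true: g was copied to g'
console.log(ys.right.right === xs.right.right); // true: h is shared
console.log(xs.right.left.left);              // null: xs itself is unchanged
```

Nodes of xs that are no longer reachable from any live version are exactly what the garbage collector is needed for.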

Our example revisited

Our initial references to VALUES:

(def record-1 {:state-id "S2" :county-id "C1" :population 3439 :area 97})
(def record-2 {:state-id "S5" :county-id "C2" :population 85345 :area 128})
(def record-3 {:state-id "S2" :county-id "C3" :population 7435 :area 157})


(defn group-records [records key]
  (reduce (fn [accumulator record]
            (let [key-val (get record key)
                  subrecords (get accumulator key-val [])]
              (assoc accumulator key-val (conj subrecords record))))
          {}
          records))



New merged VALUE:

(def grouped (group-records [record-1 record-2 record-3] :state-id))


Our inexperienced programmer and his function again:

(defn alter-records [records]
  ;; the first record remains unchanged
  ;; because it is a reference
  ;; to a value, and values don't
  ;; change :)
  (assoc (first records) :population 0))


We created a new VALUE state-2-altered, but our record-1 remains unchanged, even though those two VALUES partially share their structure:

(def state-2-altered (alter-records (grouped "S2")))
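For comparison, a rough JavaScript counterpart of alter-records. It is only an approximation: `alterRecords` is a hypothetical name, `Object.freeze` is shallow, and `Object.assign` makes a flat copy rather than sharing structure the way Clojure maps do.

```javascript
var record1 = Object.freeze({state_id: 'S2', county_id: 'C1', population: 3439, area: 97});
var record3 = Object.freeze({state_id: 'S2', county_id: 'C3', population: 7435, area: 157});

// Builds a new object from the first record with one field overridden;
// the first record itself is never written to.
var alterRecords = function(records) {
  return Object.assign({}, records[0], { population: 0 });
};

var state2altered = alterRecords([record1, record3]);

console.log(state2altered.population);  // 0
console.log(record1.population);        // 3439
console.log(state2altered === record1); // false
```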

Key points from our Imperative/Functional comparison

● In our imperative example, we created many more identities than we really needed.

● The identities we created (record1, record2, record3) would be better modeled as values instead.

● The value of values should not be undervalued :)

But what if we really need identities ?

● Most programs need identities
➢ There are programs resembling huge functions, such as compilers, but most other programs need to model identities

● It's worthwhile to separate identity and state
➢ Instead of thinking about an identity's state as the contents of a particular memory block, it's better to think about it as a value currently associated with the identity
➢ The identity can be in different states at different times, but the state itself doesn't change
➢ Thus, the identity is not a state; the identity has a state, exactly one at any point in time

How do we model identities ?

● We need atomic references to values
➢ Because every 'value-swap' of the identity (remember, every identity has a state, which is an immutable value) needs to be atomic (similar to atomic database commits, always resulting in a consistent database state)
➢ In Clojure, those changes to references are controlled and coordinated by the system, so the cooperation is neither optional nor manual
➢ The world moves forward due to the cooperative efforts of its participants, and the programming language/system, Clojure, is in charge of world consistency management. The value of a reference (state of an identity) is always observable without coordination, and freely shareable between threads
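The shape of such a reference can be sketched in JavaScript. This `Atom` is a hypothetical illustration, not Clojure's implementation: JavaScript runs this code on a single thread, so the retry loop never actually retries here, whereas on the JVM the compare-and-set step has to be a real atomic operation.

```javascript
// An identity: a holder whose state is always one immutable value.
function Atom(initial) {
  this.value = initial;
}

// Observing the current state needs no coordination.
Atom.prototype.deref = function() {
  return this.value;
};

// Install newValue only if the state is still the expected one.
Atom.prototype.compareAndSet = function(expected, newValue) {
  if (this.value !== expected) { return false; }
  this.value = newValue;
  return true;
};

// Compute the next state from the current one with a pure function,
// retrying if the state changed in between.
Atom.prototype.swap = function(fn) {
  for (;;) {
    var current = this.value;
    var next = fn(current);
    if (this.compareAndSet(current, next)) { return next; }
  }
};

var cache = new Atom(Object.freeze(['D1', 'D2', 'D3']));
var before = cache.deref();
cache.swap(function(data) {
  return Object.freeze(data.concat(['D4', 'D5']));
});

console.log(before.join(' '));        // D1 D2 D3
console.log(cache.deref().join(' ')); // D1 D2 D3 D4 D5
```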

Example of atomic reference in Clojure

We define our initial cache data:
(def initial-data ["D1" "D2" "D3"])

Now we create a special atomic reference named cache:
(def cache (atom initial-data))

We read the cache into an intermediate variable:
(def cache-data (deref cache))

The result is true:
(= initial-data cache-data)

We swap the cache for a different value; in this case we add "D4" and "D5":
(swap! cache conj "D4" "D5")

Whenever the cache is dereferenced, we get the new data:
(deref cache)

But the former cache reading cache-data is still unchanged, so this remains true:
(= initial-data cache-data)

There is a lot more to discover

● There are more reference types in Clojure, but that is out of scope of this talk

● And that's not all: Clojure has many more cool features that you won't find in many other languages, such as multimethods or true macros for metaprogramming

● Visit http://clojure.org/ to learn more
