Yet another object system for R


Hadley Wickham, Rice University

Friday, 31 July 2009

Does R need another object system?


Are S3 and S4 enough?


“Because it’s there”—George Mallory

CC BY: http://www.flickr.com/photos/mckaysavage/497617014



1. Paradigms of programming

2. Existing systems in R

3. Prototype based programming

4. Examples

5. Scoping

6. Uses


Programming Paradigms for Dummies

[Figure 2. Taxonomy of programming paradigms. A graph of paradigm boxes (for example functional, imperative, object-oriented, concurrent, logic and constraint programming) joined by arrows labelled with the concept added (for example + closure, + cell (state), + thread, + search). Paradigms are arranged by expressiveness of state, from less to more and from unnamed to named state, and by whether nondeterminism is observable.]

2.1 Taxonomy of programming paradigms

Figure 2 gives a taxonomy of all major programming paradigms, organized in a graph that shows how they are related [55]. This figure contains a lot of information and rewards careful examination. There are 27 boxes, each representing a paradigm as a set of programming concepts. Of these 27 boxes, eight contain two paradigms with different names but the same set of concepts. An arrow between two boxes represents the concept or concepts that have to be added to go from one paradigm to the next. The concepts are the basic primitive elements used to construct the paradigms. Often two paradigms that seem quite different (for example, functional programming and object-oriented programming) differ by just one concept. In this chapter we focus on the programming concepts and how the paradigms emerge from them. With n concepts, it is theoretically possible to construct 2^n paradigms. Of course, many of these paradigms are useless in practice, such as the empty paradigm (no concepts)[1] or paradigms with only one concept. A paradigm almost always has to be Turing complete to be practical. This explains why functional programming is so important: it is based on the concept of first-class function,

[1] Similar reasoning explains why Baskin-Robbins has exactly 31 flavors of ice cream. We postulate that they have only 5 flavors, which gives 2^5 - 1 = 31 combinations with at least one flavor. The 32nd combination is the empty flavor. The taste of the empty flavor is an open research question.


Programming Paradigms for Dummies: What Every Programmer Should Know. Peter Van Roy. http://www.info.ucl.ac.be/~pvr/VanRoyChapter.pdf

P. van Roy and S. Haridi. Concepts, Techniques and Models of Computer Programming. The MIT Press, 2004.



S3 / S4: immutable (pass-by-value) + generic functions (function-based OO)

Alternative: mutable (pass-by-reference) + message passing (class-based OO)
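To make the two dispatch styles concrete, here is a minimal base-R sketch. The names describe, make_dog, rex and rex2 are made up for illustration and come from none of the systems discussed in the talk.

```r
# Generic-function OO (S3): the generic function owns the dispatch and
# picks a method based on the class of its argument.
describe <- function(x, ...) UseMethod("describe")
describe.dog <- function(x, ...) paste(x$name, "says woof")
describe.default <- function(x, ...) "unknown object"

rex <- structure(list(name = "Rex"), class = "dog")
s3_result <- describe(rex)

# Message-passing OO: the object owns its methods, and the call names the
# object first. An environment holds both the data and the method.
make_dog <- function(name) {
  self <- new.env()
  self$name <- name
  self$describe <- function() paste(self$name, "says woof")
  self
}

rex2 <- make_dog("Rex")
mp_result <- rex2$describe()
```

Both calls produce the same string. The difference is where the method lives: attached to the generic function in the first case, and stored inside the object in the second.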


Mutable versus immutable

Mutable: required by many of the most efficient algorithms; simplifies dependence between components; concurrency hard; hard to reason about.

Immutable: hard to derive computational complexity; must “thread” state; concurrency easy; compilers can make very efficient.
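A small base-R sketch of the difference, assuming a made-up account example (deposit_value, deposit_ref, acct and acct_env are illustrative names only). Lists behave as immutable values, so updated state has to be "threaded" back to the caller. Environments are modified in place.

```r
# Immutable style: the function receives a copy, so the caller must
# capture the return value to see the change.
deposit_value <- function(account, amount) {
  account$balance <- account$balance + amount
  account
}

acct <- list(balance = 0)
deposit_value(acct, 10)          # result discarded
unchanged <- acct$balance        # the caller's list was not modified
acct <- deposit_value(acct, 10)  # state threaded through the return value
threaded <- acct$balance

# Mutable style: environments have reference semantics, so the change is
# visible to every holder of the reference.
deposit_ref <- function(account, amount) {
  account$balance <- account$balance + amount
  invisible(account)
}

acct_env <- new.env()
acct_env$balance <- 0
deposit_ref(acct_env, 10)
mutated <- acct_env$balance
```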


Aside: memory efficiency

“A persistent data structure is a data structure which always preserves the previous version of itself when it is modified.”

“While persistence can be achieved by simple copying, this is inefficient in time and space, because most operations make only small changes to a data structure. A better method is to exploit the similarity between the new and old versions to share structure between them, such as using the same subtree in a number of tree structures.”

http://en.wikipedia.org/wiki/Persistent_data_structure
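As a rough illustration of the quoted idea, here is a persistent stack sketched in base R. The helpers push, pop and peek are hypothetical names. Each new version is a small (head, tail) pair whose tail is the previous version, so older versions stay valid, and under R's copy-on-modify semantics the tail is typically not duplicated.

```r
# A persistent stack: NULL is the empty stack, and every other version
# is a pair of the newest element and the previous version.
push <- function(stack, x) list(head = x, tail = stack)
pop  <- function(stack) stack$tail
peek <- function(stack) stack$head

v1 <- push(NULL, "a")  # stack holding "a"
v2 <- push(v1, "b")    # new version; v1 is reused as its tail
v3 <- pop(v2)          # popping gives back the earlier version

# v1 is still usable after v2 was built from it.
top_v1 <- peek(v1)
top_v2 <- peek(v2)
```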


Existing systems

S3/S4: immutable + generic functions

R.oo: mutable + generic functions

OOP: mutable + message passing

Proto: mutable + message passing

YAROS: mutable + message passing


Prototype based programming

Generalisation of class-based OO (like Java) that removes the distinction between classes and instances.

Single dispatch, but often implements multiple (and dynamic) inheritance.

Notable languages: JavaScript, Io


Counter <- Object$clone()$do({
  init <- function() self$counter <- 0
  count <- function() {
    self$counter <- self$counter + 1
    self$counter
  }
})

counter_a <- Counter$clone()
counter_b <- Counter$clone()

counter_a$count()
counter_a$count()
counter_b$count()
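The Object used above belongs to the talk's own system and is not defined here. As a rough base-R approximation of the same idea, the sketch below uses environments as objects and the parent environment as the prototype. The helpers clone and send are assumptions made for illustration, not the talk's implementation. Unlike the slide code, this sketch passes self explicitly.

```r
# Hypothetical minimal prototype system: cloning creates an empty
# environment whose parent is the prototype, so missing slots are found
# by walking up the parent chain.
clone <- function(proto) new.env(parent = proto)

# Look the method up through the prototype chain, then call it with the
# receiving object as `self`.
send <- function(obj, name, ...) {
  method <- get(name, envir = obj)
  method(obj, ...)
}

Counter <- new.env()
Counter$counter <- 0
Counter$count <- function(self) {
  # get() inherits from the prototype; the assignment is local to self
  self$counter <- get("counter", envir = self) + 1
  self$counter
}

counter_a <- clone(Counter)
counter_b <- clone(Counter)

a1 <- send(counter_a, "count")
a2 <- send(counter_a, "count")
b1 <- send(counter_b, "count")
```

Each clone keeps its own count once it has been incremented, while the prototype's counter slot is left untouched. No class definition is involved: Counter is an ordinary object that other objects are cloned from.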


Account <- Object$clone()$do({
  balance <- 0.0
  deposit <- function(v) self$balance <- self$balance + v
  withdraw <- function(v) self$balance <- self$balance - v
  show <- function() cat("Account balance: $", self$balance, "\n")
  init <- function() self$balance <- 0
})

Account$show()
cat("Depositing $10\n")
Account$deposit(10.0)
Account$show()

Savings <- Account$clone()$do({
  interest <- 0.05
  withdraw <- NULL
})
Savings$show()


Observable <- Object$clone()$do({
  listeners <- list()
  add_listener <- function(f) {
    self$listeners <- c(self$listeners, f)
  }
  signal <- function(...) {
    for (l in self$listeners) l(...)
  }
  init <- function() {
    self$listeners <- list()
  }
})

counter_a$append_proto(Observable$clone())
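The listener pattern can be tried without the talk's Object by using a plain environment and closures. This is a sketch under that assumption: make_observable and seen are illustrative names, and the closure-based self here is ordinary lexical scoping rather than the object scoping the talk describes.

```r
# Hypothetical stand-alone observable built from an environment.
make_observable <- function() {
  self <- new.env()
  self$listeners <- list()
  self$add_listener <- function(f) {
    # wrap f in list() so the function is appended as a single element
    self$listeners <- c(self$listeners, list(f))
    invisible(self)
  }
  self$signal <- function(...) {
    # call every registered listener with the same arguments
    for (l in self$listeners) l(...)
    invisible(self)
  }
  self
}

seen <- character()
obs <- make_observable()
obs$add_listener(function(msg) seen <<- c(seen, paste("first:", msg)))
obs$add_listener(function(msg) seen <<- c(seen, paste("second:", msg)))
obs$signal("tick")
```

After the signal, seen holds one entry per listener, in registration order.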


Advantages

Clean separation of object methods and base functions.

No extra function parameters.

Rich behaviour, including introspection, based on Io.

Mutable, multiple inheritance (depth-first search of inheritance graph).


Scoping

self$method() needs to have object scoping, not lexical scoping.

That is, rather than looking names up in the environment at the time the function was defined, look them up based on the current object.


"$.io" <- function(x, i, ...) {
  res <- core(x)$get_local_slot(i)
  object_scope(res, x)
}

object_scope <- function(res, self) {
  # Add environment to top of stack that contains
  # the self object
  if (is.function(res)) {
    env <- new.env()
    env$self <- self
    parent.env(env) <- environment(res)
    environment(res) <- env
  }
  res
}
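The environment trick in object_scope can be tried in isolation. In the sketch below, bind_self is a hypothetical stand-in for it, and two plain environments stand in for objects, since core() and get_local_slot() belong to the talk's system and are not available here.

```r
# Wrap a function so that the free variable `self` resolves to the given
# object: insert a small environment holding `self` in front of the
# function's original enclosing environment.
bind_self <- function(f, self) {
  if (is.function(f)) {
    env <- new.env(parent = environment(f))
    env$self <- self
    environment(f) <- env
  }
  f
}

# Two stand-in objects sharing a single method definition.
obj_a <- new.env()
obj_a$n <- 1
obj_b <- new.env()
obj_b$n <- 10

# `self` is not defined where get_n is written; it is supplied at lookup time.
get_n <- function() self$n

result_a <- bind_self(get_n, obj_a)()
result_b <- bind_self(get_n, obj_b)()

# Non-function slots pass through unchanged.
plain <- bind_self(42, obj_a)
```

The same function body yields a different result for each object it is bound to, which is the object scoping described above. Other free variables still resolve lexically through the original enclosing environment.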


Uses

Can simplify code which requires coordinating data from multiple locations: scale code in ggplot2

Complex simulations: e.g. card counting in blackjack

When there really is one true underlying object: GUIs and interactive graphics


Conclusions

S3/S4 don’t fulfil every need, and it’s fun to experiment with alternative paradigms.

Prototype-based OO is an interesting idea that removes the distinction between inheritance and instantiation.

We can actually implement it in R, with different scoping rules. It has been a great learning experience.

