Yet another object system for R

Yet another object system for R Hadley Wickham Rice University Friday, 31 July 2009

Transcript of Yet another object system for R

Page 1: Yet another object system for R

Yet another object system for R

Hadley Wickham, Rice University

Friday, 31 July 2009

Page 2: Yet another object system for R

Does R need another object system?


Page 3: Yet another object system for R

Are S3 and S4 enough?


Page 4: Yet another object system for R

“Because it’s there”—George Mallory

CC BY: http://www.flickr.com/photos/mckaysavage/497617014


Page 5: Yet another object system for R

1. Paradigms of programming

2. Existing systems in R

3. Prototype based programming

4. Examples

5. Scoping

6. Uses


Page 6: Yet another object system for R

Programming Paradigms for Dummies

[Figure 2. Taxonomy of programming paradigms. Diagram not reproduced: boxes of paradigms (for example functional, imperative, and object-oriented programming) linked by the concepts that separate them, arranged by expressiveness of state (less to more) and by whether nondeterminism is observable (no or yes).]

2.1 Taxonomy of programming paradigms

Figure 2 gives a taxonomy of all major programming paradigms, organized in a graph that shows how they are related [55]. This figure contains a lot of information and rewards careful examination. There are 27 boxes, each representing a paradigm as a set of programming concepts. Of these 27 boxes, eight contain two paradigms with different names but the same set of concepts. An arrow between two boxes represents the concept or concepts that have to be added to go from one paradigm to the next. The concepts are the basic primitive elements used to construct the paradigms. Often two paradigms that seem quite different (for example, functional programming and object-oriented programming) differ by just one concept. In this chapter we focus on the programming concepts and how the paradigms emerge from them. With n concepts, it is theoretically possible to construct 2^n paradigms. Of course, many of these paradigms are useless in practice, such as the empty paradigm (no concepts)¹ or paradigms with only one concept. A paradigm almost always has to be Turing complete to be practical. This explains why functional programming is so important: it is based on the concept of first-class function,

¹ Similar reasoning explains why Baskin-Robbins has exactly 31 flavors of ice cream. We postulate that they have only 5 flavors, which gives 2^5 - 1 = 31 combinations with at least one flavor. The 32nd combination is the empty flavor. The taste of the empty flavor is an open research question.


Page 7: Yet another object system for R

P. Van Roy. Programming Paradigms for Dummies: What Every Programmer Should Know. http://www.info.ucl.ac.be/~pvr/VanRoyChapter.pdf

P. Van Roy and S. Haridi. Concepts, Techniques, and Models of Computer Programming. The MIT Press, 2004.


Page 8: Yet another object system for R

S3 / S4 | Alternative
Immutable (pass-by-value) | Mutable (pass-by-reference)
Generic functions (function-based OO) | Message passing (class-based OO)
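The two dispatch styles can be contrasted with a minimal base R sketch. The names `area`, `circle` and `circle_obj` are invented for illustration, and the environment-based object is only a stand-in for message passing, not the API of any package mentioned in these slides.

```r
# Generic-function style (S3): you call the function, and it picks a
# method based on the class of its argument.
area <- function(shape) UseMethod("area")
area.circle <- function(shape) pi * shape$r^2

circle <- structure(list(r = 1), class = "circle")
area_generic <- area(circle)

# Message-passing style: the method lives inside the object, and you
# "send a message" by calling obj$method().
circle_obj <- new.env()
circle_obj$r <- 1
circle_obj$area <- function() pi * circle_obj$r^2

area_message <- circle_obj$area()
```

Both calls compute the same value; what differs is whether the function or the object is responsible for finding the method.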


Page 9: Yet another object system for R

Mutable | Immutable
Required by many of the most efficient algorithms | Hard to derive computational complexity
Simplifies dependence between components | Must “thread” state
Concurrency hard | Concurrency easy
Hard to reason about | Compilers can make very efficient
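What it means to "thread" state can be shown with a small base R sketch. It assumes a plain list as the immutable (copy-on-modify) object and an environment as the mutable one; `deposit_value` and `deposit_ref` are hypothetical helper names.

```r
# Immutable style: the function works on a copy, so the caller must
# capture the return value and pass it along ("threading" the state).
deposit_value <- function(account, amount) {
  account$balance <- account$balance + amount
  account
}

acc <- list(balance = 0)
ignored <- deposit_value(acc, 10)  # result not assigned back to acc
unchanged <- acc$balance           # acc itself was not modified
acc <- deposit_value(acc, 10)      # state threaded through the call

# Mutable style: environments have reference semantics, so the change
# is visible to every holder of the object.
deposit_ref <- function(account, amount) {
  account$balance <- account$balance + amount
  invisible(NULL)
}

acc_env <- new.env()
acc_env$balance <- 0
deposit_ref(acc_env, 10)
```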


Page 10: Yet another object system for R

Aside: memory efficiency

“A persistent data structure is a data structure which always preserves the previous version of itself when it is modified.”

“While persistence can be achieved by simple copying, this is inefficient in time and space, because most operations make only small changes to a data structure. A better method is to exploit the similarity between the new and old versions to share structure between them, such as using the same subtree in a number of tree structures.”

http://en.wikipedia.org/wiki/Persistent_data_structure
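A minimal sketch of the idea in the quotes, assuming a stack built from nested lists ("cons cells"). The helper names are invented for illustration. Each push creates one new cell whose tail is the old stack, so the old version stays valid; as far as I understand R's copy-on-modify behaviour, storing the old stack as `tail` does not deep-copy it.

```r
# A persistent stack: every version remains usable after a push.
empty_stack <- NULL
push <- function(stack, value) list(head = value, tail = stack)
peek <- function(stack) stack$head
pop  <- function(stack) stack$tail

# Count cells by walking the chain of tails.
size <- function(stack) {
  n <- 0
  while (!is.null(stack)) {
    n <- n + 1
    stack <- stack$tail
  }
  n
}

s1 <- push(push(empty_stack, 1), 2)  # version 1 holds 1, 2
s2 <- push(s1, 3)                    # version 2 holds 1, 2, 3 and reuses s1 as its tail
```

After the second push, `s1` still has two elements and `pop(s2)` is identical to `s1`.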

Page 11: Yet another object system for R

Existing systems

S3/S4: immutable + generic functions

R.oo: mutable + generic functions

OOP: mutable + message passing

Proto: mutable + message passing

YAROS: mutable + message passing


Page 12: Yet another object system for R

Prototype based programming

Generalisation of class-based OO (like Java) that removes the distinction between classes and instances.

Single dispatch, but implementations often support multiple (and dynamic) inheritance.

Notable languages: JavaScript, Io
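The idea can be sketched in base R with environments, assuming that an object's prototype is simply its parent environment. `clone` and `get_slot` are hypothetical helpers written for this illustration, not functions from the package described in these slides.

```r
# clone() makes a new, empty object that delegates to its prototype.
clone <- function(proto) new.env(parent = proto)

# Slot lookup checks the object first, then walks up its prototypes.
get_slot <- function(obj, name) get(name, envir = obj, inherits = TRUE)

point <- new.env(parent = emptyenv())
point$x <- 0
point$y <- 0

p1 <- clone(point)  # no class involved: p1 is cloned from another object
p1$x <- 5           # overrides x locally; y is still delegated

y_before <- get_slot(p1, "y")
point$y <- 10       # changing the prototype later is visible through p1
y_after <- get_slot(p1, "y")
```

Because lookup happens at call time, a change to `point` shows up in every clone that has not overridden that slot, which is the "dynamic" part of dynamic inheritance.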


Page 13: Yet another object system for R

# Prototype object with an init method and a count method
Counter <- Object$clone()$do({
  init <- function() self$counter <- 0
  count <- function() {
    self$counter <- self$counter + 1
    self$counter
  }
})

# Two clones of the prototype
counter_a <- Counter$clone()
counter_b <- Counter$clone()

counter_a$count()
counter_a$count()
counter_b$count()
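For comparison, a counter with the same mutable behaviour can be sketched in base R alone. This assumes an environment as the object and uses a hypothetical constructor `make_counter`; it does not use the `Object$clone()` API shown on the slide.

```r
# Each call creates a fresh environment holding its own state.
make_counter <- function() {
  self <- new.env()
  self$counter <- 0
  self$count <- function() {
    # self is found lexically in make_counter's frame; environments are
    # mutable, so the update persists between calls.
    self$counter <- self$counter + 1
    self$counter
  }
  self
}

a <- make_counter()
b <- make_counter()

first  <- a$count()
second <- a$count()
other  <- b$count()  # b has its own state, unaffected by a
```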


Page 14: Yet another object system for R

Account <- Object$clone()$do({
  balance <- 0.0
  deposit <- function(v) self$balance <- self$balance + v
  withdraw <- function(v) self$balance <- self$balance - v
  show <- function() cat("Account balance: $", self$balance, "\n")
  init <- function() self$balance <- 0
})

Account$show()
cat("Depositing $10\n")
Account$deposit(10.0)
Account$show()

# Savings is cloned from Account, adds an interest slot,
# and overrides withdraw with NULL
Savings <- Account$clone()$do({
  interest <- 0.05
  withdraw <- NULL
})
Savings$show()


Page 15: Yet another object system for R

Observable <- Object$clone()$do({
  listeners <- list()
  add_listener <- function(f) {
    self$listeners <- c(self$listeners, f)
  }
  signal <- function(...) {
    for (l in self$listeners) l(...)
  }
  init <- function() {
    self$listeners <- list()
  }
})

# Mix the Observable behaviour into an existing object
counter_a$append_proto(Observable$clone())
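The observer pattern itself can be tried out in base R. This is a sketch that assumes an environment-backed object and a hypothetical constructor `make_observable`; the slot names mirror the slide, but nothing here uses its `Object` API.

```r
make_observable <- function() {
  self <- new.env()
  self$listeners <- list()
  self$add_listener <- function(f) {
    # Wrap f in list() so the function is appended as a single element.
    self$listeners <- c(self$listeners, list(f))
    invisible(self)
  }
  self$signal <- function(...) {
    # Call every registered listener with the same arguments.
    for (l in self$listeners) l(...)
    invisible(NULL)
  }
  self
}

obs <- make_observable()
seen <- character()
obs$add_listener(function(msg) seen <<- c(seen, msg))

obs$signal("first")
obs$signal("second")
```

After the two signals, `seen` holds the messages in the order they were sent.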


Page 16: Yet another object system for R

Advantages

Clean separation of object methods and base functions.

No extra function parameters.

Rich behaviour, including introspection, based on Io.

Mutable, multiple inheritance (depth-first search of inheritance graph).
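Depth-first lookup over several prototypes can be sketched as follows. The representation is an assumption made for illustration (an object is a list of local slots plus an ordered list of prototypes), and `new_object` and `find_slot` are hypothetical names; the sketch covers lookup order only, not mutability or cycles in the inheritance graph.

```r
new_object <- function(slots = list(), protos = list()) {
  list(slots = slots, protos = protos)
}

# Look in the object itself, then search each prototype in order,
# fully exploring one prototype's ancestors before trying the next.
# A slot whose value is NULL is treated as missing in this sketch.
find_slot <- function(obj, name) {
  if (name %in% names(obj$slots)) return(obj$slots[[name]])
  for (p in obj$protos) {
    found <- find_slot(p, name)
    if (!is.null(found)) return(found)
  }
  NULL
}

base  <- new_object(list(greet = "hello", kind = "base"))
left  <- new_object(list(kind = "left"), protos = list(base))
right <- new_object(list(greet = "hi", colour = "red"))
child <- new_object(protos = list(left, right))

kind   <- find_slot(child, "kind")    # found directly on left
greet  <- find_slot(child, "greet")   # found on base, via left, before right is tried
colour <- find_slot(child, "colour")  # only right has it
```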


Page 17: Yet another object system for R

Scoping

self$method() needs to have object scoping, not lexical scoping.

That is, rather than looking it up in the environment where the function was defined, look it up in the current object.


Page 18: Yet another object system for R

"$.io" <- function(x, i, ...) {
  res <- core(x)$get_local_slot(i)
  object_scope(res, x)
}

object_scope <- function(res, self) {
  # Add environment to top of stack that contains
  # the self object
  if (is.function(res)) {
    env <- new.env()
    env$self <- self
    parent.env(env) <- environment(res)
    environment(res) <- env
  }
  res
}
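The effect of `object_scope` can be checked in isolation. The sketch below repeats the function from the slide so that it is self-contained, and applies it to plain environments; `describe`, `alice` and `bob` are made-up names, and the `"$.io"` method and `core()` are left out.

```r
object_scope <- function(res, self) {
  if (is.function(res)) {
    env <- new.env()
    env$self <- self
    parent.env(env) <- environment(res)
    environment(res) <- env
  }
  res
}

# describe() refers to self, which is not defined where it is written.
describe <- function() paste("I am", self$name)

alice <- new.env()
alice$name <- "Alice"
bob <- new.env()
bob$name <- "Bob"

# Each call returns a copy of describe whose environment supplies self.
as_alice <- object_scope(describe, alice)
as_bob   <- object_scope(describe, bob)

result_alice <- as_alice()
result_bob   <- as_bob()
```

The same function body yields a different result depending on which object it was fetched from. The original `describe` is left untouched because `environment(res) <- env` modifies only the local copy.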


Page 19: Yet another object system for R

Uses

Can simplify code that requires coordinating data from multiple locations: scale code in ggplot2

Complex simulations: e.g. card counting in blackjack

When there really is one true underlying object: GUIs and interactive graphics


Page 20: Yet another object system for R

Conclusions

S3/S4 don’t fulfil every need, and it’s fun to experiment with alternative paradigms.

Prototype-based OO is an interesting idea that removes the distinction between inheritance and instantiation.

We can actually implement it in R, with different scoping rules. It has been a great learning experience.
