The Architecture of Cooperation: Does Code Architecture Mitigate Free-riding in the Open Source Development Model?

Carliss Y. Baldwin and Kim B. Clark

© Carliss Y. Baldwin and Kim B. Clark, 2003

Cutting to the chase at the end of the paper…

What does a collectively produced artifact like Linux mean for a competing product like Windows™?


A Collective Open Development Process vs. a Proprietary Closed Development Process

Assume: one firm, one collective, j modules
Equal quality output
Developer-users will purchase the proprietary system instead of coding modules for a collective system if the price of the proprietary system < their cost of coding
Cost per developer of a collective system with j modules = c/j (for Workers; 0 for free-riders)
So max proprietary system price = c/j
So max revenue = Nc/j (N is the number of customers)
Cost of creating a proprietary system comparable to the collective system
  = c if no option value;
  = kj*c if option value (we show this…)


Horse race

Under these assumptions, we show:
NPV of the proprietary codebase
  = Nc/j - c if codebase is modular
  = Nc/j - kj*c if codebase is modular and has option value; and
If N ≤ kj*j, no commercial opportunity exists (though sunk cost assets may survive)
kj* and j are consequences of code architecture; kj* is increasing, concave in j
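The horse-race arithmetic above can be checked with a few lines of Python. This is only a sketch of the slide's formulas: the function and parameter names are illustrative, and `k_star` stands in for kj*, which the slides treat as an output of the architecture rather than a free input (set it to 1 for the no-option-value case).

```python
def proprietary_npv(n_customers: float, c: float, j: int, k_star: float = 1.0) -> float:
    """NPV of the proprietary codebase: N*c/j - k*c (k = 1 if no option value)."""
    if j < 1:
        raise ValueError("j must be at least 1")
    max_price = c / j                  # customers pay at most their own coding cost
    max_revenue = n_customers * max_price
    creation_cost = k_star * c         # cost of matching the collective system
    return max_revenue - creation_cost


def has_commercial_opportunity(n_customers: float, j: int, k_star: float = 1.0) -> bool:
    """True only when N > k*j, i.e. when the NPV above is strictly positive."""
    return n_customers > k_star * j
```

For example, `proprietary_npv(100, 10.0, 5)` evaluates 100 * 10/5 - 10, and with `k_star=2` and `j=5` the break-even customer count is 10.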


Related Literature: Vast

Eric Raymond
– Software is a non-rival good; cost of revelation
– “Scratching an itch”
– “Reputation game”
– “Enough eyeballs”
Rishab Ghosh (“cooking pot”, generalized exchange)
Lerner and Tirole (simple economics, reputation => wealth)
Justin Pappas Johnson (“private provision of public goods”)
Harhoff, Henkel and von Hippel (“free revelation”)
James Bessen (users benefit from a customizable codebase)
Von Hippel and von Krogh, O’Mahony, Benkler (this is a new institutional/organizational form)


We position our work in a different economics literature

We start with a (specific) production process and a related (non-arbitrary) production function
Go on to derive/deduce the institutional structures that can support the production process, hence “deliver the value” of the production function
– “New Institutional Economics”
» “Institutional Structure of Production” or ISP
Aoki-Hurwicz-Greif institutions
– An institution is an equilibrium of a set of linked games, plus summary beliefs that are self-confirming as the play unfolds.


“Start with a production process”

Formalized for economic analysis as a production function
Ask “What does the process need?”
– To function at all (thresholds)
– To function as an equilibrium of linked games
– To function efficiently
“Bricks are expensive,” so
– Which bricks are essential?
– Which additional bricks have high NPVs?
– Ivy may be expensive, too; it’s an open question


A production function for design processes

We assert: the production function of any design process can be written as

$$V = I(t)\,\Big\{ V(\mathrm{Min}) + \sum_{j} \max_{k_j} V_j(k_j;\ \sigma_j) \Big\} - \mathrm{Costs}$$

Reading the terms left to right: $I(t)$ = Thresholds; $V(\mathrm{Min})$ = A Minimal System; the sum over modules = Modules/Options; Costs.

The specifics of this function are determined by the architecture of the design = architecture of the system

Vs are often recursive => modules within modules

Process/function can be extensive, but IS MAPPABLE


If this function is descriptively true, we should be able to derive the institutions needed to sustain design processes as equilibria of linked games (plus summary beliefs) about instances of this function

An Instance = An Architecture


Open Source is:

– A Social Movement (Free Software)
– A New System of Property Rights (GNU GPL)
– A Bunch of Organizations + Governance Structures
– Many Software Development Processes (One Method?)

Complementary Institutional Structure(s)


Software Development Processes Are Design Processes

– A Social Movement (Free Software)
– A New System of Property Rights (GNU GPL)
– A Bunch of Organizations + Governance Structures
– Many Software Development Processes (One Method?) = Design Processes


Open Source is a set of complementary institutional structures that sustain lots of design processes

=> A “test” of our production function thesis

=> Prediction: Open Source institutional structures should arise as equilibria of some specifications of our function, i.e., for some architectures of codebases.


Two Properties of Code Architecture

Modular Structure

Option Value


Modularity

Module = a set of tasks
– separable from others;
– with additive incremental value
– Unit of design substitution
– No. of modules = j

Global Design Rules

Module A Module B Module C Module D


Modularity

Applies to groups of tasks.

Modular in design ≠ Modular at runtime

– Linux is modular in design but monolithic at runtime.

» So is Unix

– Minix is modular at runtime, but (arguably) monolithic in design.


Option Value

Design process is a search under uncertainty
Design substitution is optional
Versions are evidence of option values being realized over time

[Diagram: Global Design Rules v.1, with successive versions: Version 1.0, Version 1.2, Version 1.5, Version 1.8]


Modularity and Option Values are “architectural properties” because

(1) They are observable in early and incomplete code releases; and

(2) They affect the way the codebase evolves, i.e., gets built out


How Modularity and Option Value Work: Intuition/Analogy

Cooking dinner (Rival good: lot size = 12 portions)
– One big stew = Not modular, no option value
» A cook has no incentive to join with other cooks
– Meat, salad, dessert = Modular, j = 3
» Three cooks have incentives to get together
– Two different stew recipes = Option value, σ > 0
» Two cooks, pick the best recipe after the fact
– Three courses, two recipes per course = Modules with option value
» Six cooks will voluntarily join up, cook, and feed each other
» May feed an additional 6-18 people (free riders)
Collective Church recipe book (Non-rival good)
– Contributions = #courses x #recipes per course


Open Source Development Process

1 2 3 4 5 6 7 8

Design Contribute Code Post Integrate Test Report Bugs Patch

This paper looks at the early stages, only.

“Involuntary Beneficence”

Decision to join and work or free-ride

+“Voluntary Revelation”

Decision to publish code, comments, etc.

Suggests that those stages of the process can be characterized in terms of two linked games.


The Two Linked Games

Work (write code, patch, etc.)

Reveal (publish code, comments, etc.)

/* bitmap.c contains the code that handles the inode and block bitmaps */
#include <string.h>

#include <linux/sched.h>
#include <linux/kernel.h>

#define clear_block(addr) \
__asm__("cld\n\t" \
	"rep\n\t" \
	"stosl" \
	::"a" (0),"c" (BLOCK_SIZE/4),"D" ((long) (addr)):"cx","di")


First game: “Involuntary Beneficence”

Decision 1:
– Join a collective development process; or
– Code in isolation
If a developer joins and works, his/her work product will automatically benefit other joiners (who may be free-riding). Standard, convenient assumption.
Decision 2: Within collective,
– Work; or
– Free-ride
“Private provision of public goods” game


First Game: Formal Setup

Non-rival good: agents’ outside alternative is to code alone; involuntary revelation => free-riding
Two-stage (join/work), one round. Work interval equals the time needed to code one module; all work synchronous.
– Or endogenous sequences to exhaust modules/option value
Subgame perfect Nash equilibrium
– Sequential or simultaneous entry
– Pure, mixed and evolutionarily stable strategies
Code Architecture visible to agents
– Some number of symmetric modules, j ≥ 1
– Value per module = v/j; Cost per module = c/j
– Some option value per module (σ ≥ 0)
– “Perfect” and “imperfect” information
Number of workers is the outcome of equilibrium


First Game—Results

If codebase is NOT modular and has NO option value, a working developer does just as well coding in isolation as joining the collective.

If codebase is modular OR has option value, working developers do better in the collective than coding alone.

Modularity and option value are economic complements: more of one makes more of the other more valuable (Baldwin and Clark, Design Rules, 2000)
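The modularity half of these results can be illustrated with a small payoff comparison. This is a simplified sketch, not the paper's formal model: it assumes no option value, exactly one Worker per module with no redundant effort, and that every joiner ends up with the full system value v. The function names are hypothetical.

```python
def crusoe_payoff(v: float, c: float) -> float:
    """Payoff to a developer who codes the whole system alone: v - c."""
    return v - c


def worker_payoff(v: float, c: float, j: int) -> float:
    """Payoff to a Worker who codes one of j symmetric modules (cost c/j)
    and, by joining, gets access to the whole system (value v)."""
    if j < 1:
        raise ValueError("j must be at least 1")
    return v - c / j


def gain_from_joining(v: float, c: float, j: int) -> float:
    """Extra payoff from working inside the collective versus coding alone."""
    return worker_payoff(v, c, j) - crusoe_payoff(v, c)
```

With a single module (`j = 1`) the gain is zero, matching the first result; with more modules the Worker saves the cost of the modules that others write.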


The Equilibrium Number of Working Developers in a Game of Involuntary Beneficence as a Function of Cost-to-Value Ratio, c/v, and Number of Modules, j (Imperfect Information, Redundant Effort)

No. of           Cost/Value per Module
Modules   10%   20%   30%   40%   50%   60%   70%   80%   90%  100%
      1     1     1     1     1     1     1     1     1     1     0
      2     4     3     2     2     1     1     1     1     1     0
      3     6     4     3     3     2     2     1     1     1     0
      4     9     6     5     4     3     2     2     1     1     0
      5    11     8     6     5     4     3     2     1     1     0
      6    13     9     7     6     4     3     2     2     1     0
      7    15    11     8     6     5     4     3     2     1     0
      8    18    13    10     7     6     4     3     2     1     0
      9    20    14    11     8     6     5     4     2     1     0
     10    22    16    12     9     7     5     4     3     1     0
     11    25    17    13    10     8     6     4     3     2     0
     12    27    19    14    11     8     6     5     3     2     0
     13    29    21    16    12     9     7     5     3     2     0
     14    32    22    17    13    10     7     5     4     2     0
     15    34    24    18    14    11     8     6     4     2     0
     16    36    25    19    15    11     8     6     4     2     0
     17    38    27    20    16    12     9     6     4     2     0
     18    41    29    22    17    13     9     7     4     2     0
     19    43    30    23    17    13    10     7     5     2     0
     20    45    32    24    18    14    10     7     5     3     0


The Number of Developers Working in Equilibrium, nj*, as a Function of the Cost-to-Technical-Potential Ratio, c/σ, and the Number of Modules, j (Perfect Information, Option-driven Effort)

No. of      Cost/Technical Potential
Modules   25%   50%   75%  100%  150%
      1     2     0     0     0     0
      2     6     2     0     0     0
      3    12     3     0     0     0
      4    16     8     4     0     0
      5    25    10     5     0     0
      6    30    18     6     0     0
      7    42    21     7     7     0
      8    48    24    16     8     0
      9    54    27    18     9     0
     10    70    30    20    10     0
     11    77    44    22    11     0
     12    84    48    24    12     0
     13   104    52    26    26     0
     14   112    56    42    28     0
     15   120    60    45    30    15
     16   128    64    48    32    16
     17   153    85    51    34    17
     18   162    90    54    36    18
     19   171    95    57    38    19
     20   180   100    60    40    20
     21   189   105    63    42    21
     22   220   110    66    44    22
     23   230   115    92    46    23
     24   240   120    96    72    24
     25   250   150   100    75    25

Note: E(v) = .399

Ten developers on each module
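One way to see how numbers like these can arise is sketched below. The decision rule is an assumption and is not stated on the slides: each of j symmetric modules has outcomes drawn from a normal distribution with standard deviation σ/√j, only improvements over the status quo (zero) count, each attempt costs c/j, and developers keep entering a module while the expected gain from one more attempt covers its cost. Under that assumption the single-draw expectation is about .399, consistent with the note above, and the rule reproduces several entries of the table (for example 2 developers at j = 1, 25%, and 15 developers at j = 15, 150%). All function names are hypothetical.

```python
import math
from functools import lru_cache


def std_normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


@lru_cache(maxsize=None)
def expected_best_of(k: int, upper: float = 10.0, steps: int = 10_000) -> float:
    """E[max(0, Z_1, ..., Z_k)] for standard normal draws, computed as the
    integral of 1 - Phi(x)**k over [0, upper] with the trapezoid rule."""
    if k == 0:
        return 0.0
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        y = 1.0 - std_normal_cdf(i * h) ** k
        total += y if 0 < i < steps else 0.5 * y  # half weight at both ends
    return total * h


def developers_per_module(j: int, cost_ratio: float, k_max: int = 100) -> int:
    """Largest k whose marginal expected gain still covers the cost (assumed rule).
    cost_ratio is c/sigma; the per-module threshold is (c/sigma) / sqrt(j)."""
    threshold = cost_ratio / math.sqrt(j)
    k = 0
    while k < k_max and expected_best_of(k + 1) - expected_best_of(k) >= threshold:
        k += 1
    return k


def equilibrium_workers(j: int, cost_ratio: float) -> int:
    """Total number of working developers across j symmetric modules."""
    return j * developers_per_module(j, cost_ratio)
```

Calling `equilibrium_workers(j, ratio)` over a grid of module counts and cost ratios gives a table of the same shape; cells that sit close to a threshold are sensitive to the integration accuracy.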


Second Game—“Voluntary Revelation”

In real life, developers do not have to reveal their code to one another

Suppose two developers each have coded a module (sunk cost)

Can send it to the other, but communication is costly

One bears a cost to benefit the other

This is a canonical Prisoners’ Dilemma game
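The payoff structure just described can be written out explicitly. The sketch below assumes each developer values the other's module at v and pays a communication cost r to send their own; coding costs are sunk and left out. The symbols, strategy labels, and function names are illustrative.

```python
def revelation_payoffs(v: float, r: float) -> dict:
    """Payoffs (row player, column player) for each pair of choices."""
    return {
        ("reveal", "reveal"): (v - r, v - r),
        ("reveal", "withhold"): (-r, v),
        ("withhold", "reveal"): (v, -r),
        ("withhold", "withhold"): (0.0, 0.0),
    }


def is_prisoners_dilemma(v: float, r: float) -> bool:
    """Check the ordering temptation > reward > punishment > sucker's payoff."""
    p = revelation_payoffs(v, r)
    temptation = p[("withhold", "reveal")][0]
    reward = p[("reveal", "reveal")][0]
    punishment = p[("withhold", "withhold")][0]
    sucker = p[("reveal", "withhold")][0]
    return temptation > reward > punishment > sucker
```

The ordering holds whenever v > r > 0: withholding is then the dominant choice for each developer even though mutual revelation leaves both better off.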


There are many ways to encourage cooperation in a Prisoners’ Dilemma game (Axelrod)

Reduce the cost of communicating
– Internet, email, newsgroups
Increase the rewards
– Desire to reciprocate, feelings of altruism (Benkler)
– The “Reputation Game” (Lerner-Tirole)
Create a repeated game
– Contingent strategies (e.g., Tit-for-Tat)
– Can support cooperation in equilibrium
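The last point can be made concrete with the textbook grim-trigger condition. Note that this swaps in grim trigger, a simpler contingent strategy than the Tit-for-Tat named on the slide, and it assumes the game continues each round with probability delta; the function names are hypothetical.

```python
def min_continuation_probability(temptation: float, reward: float, punishment: float) -> float:
    """Smallest delta for which grim trigger sustains cooperation:
    delta >= (T - R) / (T - P)."""
    if not temptation > reward > punishment:
        raise ValueError("expected temptation > reward > punishment")
    return (temptation - reward) / (temptation - punishment)


def cooperation_sustainable(temptation: float, reward: float,
                            punishment: float, delta: float) -> bool:
    """Compare cooperating forever with defecting once and being punished forever."""
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    cooperate_forever = reward / (1.0 - delta)
    defect_once = temptation + delta * punishment / (1.0 - delta)
    return cooperate_forever >= defect_once
```

If revealing costs r and the other developer's module is worth v (so T = v, R = v - r, P = 0, all assumed here), the threshold reduces to r/v: cheaper communication lowers the continuation probability needed to support cooperation.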


Code Architecture interacts with the Prisoners’ Dilemma Game

Modularity
– reduces the cost of a unit of contribution
– creates many different “chunks of reputation”
– creates larger “space” of repeatable games
Option value
– provides improvable modules, thus creates “contests with winners” (reputation)
– makes the arrival of the end-game a surprise


The effect of linking the two games

Reputation/repetition only has to overcome the cost of communicating (r/j)

“Work” motivated by the value of own code:
– v/j > c/j for the developer

“Joining” motivated by access to others’ code (non-rival good)

Potentially very large continuation value: V – c/j – r/j + Rj for each developer


Summary: This paper

Characterizes software as a “non-rival” good
Characterizes Open Source Development in terms of two linked games with three stages (join, work, reveal)
Interacts games with code architecture
Looks at Nash equilibria vs. “Robinson Crusoe” alternative (coding alone)
Defines a voluntary collective development process as sustainable if the equilibrium payoff to Workers is greater than Robinson Crusoe payoff


Conclusions: A Voluntary, Collective Design Process Requires

For existence:
– Designer-users
– Non-rivalrous goods
– A design architecture with modules and/or options
– Communication speeds matching the design interval for one module
– Methods of SYSTEM INTEGRATION AND TESTING (omitted here; see DR1 and Bessen)
For efficiency:
– Ways to know who’s working on what
– Ways to know which module design is better or best (Module-level testing; see DR1, contrast to Bessen)
For robustness (to solve the Prisoners’ Dilemma game):
– Rewards for communication
– Iteration with an indeterminate horizon (not strict repetition)


A Collective Open Development Process vs. a Proprietary Closed Development Process

Assume: one firm, one collective, j modules
Equal quality output
Developer-users will purchase the proprietary system instead of coding modules for a collective system if the price of the proprietary system < their cost of coding
Cost per developer of a collective system with j modules = c/j (for Workers; 0 for free-riders)
So max proprietary system price = c/j
So max revenue = Nc/j (N is the number of customers)
Cost of creating a proprietary system comparable to the collective system
  = c if no option value;
  = kj*c if option value (we show this…)


Horse race

Under these assumptions, we show:
NPV of the proprietary codebase
  = Nc/j - c if codebase is modular
  = Nc/j - kj*c if codebase is modular and has option value; and
If N ≤ kj*j, no commercial opportunity exists (though sunk cost assets may survive)
kj* and j are consequences of code architecture; kj* is increasing, concave in j


Thank you!
