Group selection and social preferences · 2017. 4. 12. · Jörgen Weibull† and Marcus...


Group selection and social preferences∗

Jörgen Weibull† and Marcus Salomonsson‡

Stockholm School of Economics, Box 6501, SE - 113 83 Stockholm, Sweden

April 15, 2005

Abstract

Suppose that a large number of individuals are randomly matched into groups where each group plays a finite symmetric game. Individuals breed true according to their individual material payoffs, but the expected number of surviving offspring may depend on the material payoff vector to the whole group. We show that the mean-field equation for the induced population dynamic is equivalent to the replicator dynamic for a game with payoffs derived from those in the original game. We apply this selection dynamic to a number of examples, including prisoners’ dilemma games, coordination games, hawk-dove games, a prisoners’ dilemma with a punishment option, and common-pool games. For each of these, we provide conditions under which our selection dynamic leads to other outcomes than those obtained under the usual replicator dynamic. By way of a revealed-preference argument, we show how our selection dynamic can explain certain stable behaviors that are consistent with individuals having social preferences.

Keywords: Group selection, social preferences, altruism, fairness.

∗We thank Milo Bianchi, Olof Leimar and participants in the conference on evolutionary game dynamics at PED, Harvard, November 2004, for comments to an earlier draft of this manuscript. We are also grateful to Bill Sandholm who provided the software we used to construct Figures 4 and 5, available at http://www.ssc.wisc.edu/~whs/. Marcus Salomonsson thanks the Wallenberg Foundation for financial support of his research.

†Corresponding author. E-mail: [email protected]. Phone: +46 8 736 92 04. Fax: +46 8 31 32 07.

‡E-mail address: [email protected].


1 Introduction

One of the longest standing controversies in evolutionary game theory has been the

group selection controversy. The group selection idea, which traces its origins all the

way back to Darwin, essentially says that groups with internal cooperation will be more

successful than other groups, and that this may cause altruistic behaviors – individual

sacrifices for the common good of the group – to survive and in some circumstances

thrive:

“There can be no doubt that a tribe including many members who, from possessing in a high degree the spirit of patriotism, fidelity, obedience, courage, and sympathy, were always ready to give aid to each other and to sacrifice themselves for the common good, would be victorious over most other tribes, and this would be natural selection.” (Darwin, 1871, page 166.)

The controversy was long believed to have been finally settled after an exchange between Wynne-Edwards (1962) and Maynard Smith (1964). The exchange was ignited by Wynne-Edwards, who argued in favor of group selection. His argumentation was informal and based on examples. In response, Maynard Smith argued that Wynne-Edwards’ examples were explicable without reference to group selection, and then went on to formulate what a more precise model of group selection might look like. Based on this model sketch, called the haystack model, Maynard Smith dismissed group selection.

In the haystack model, groups are randomly reshuffled at given time intervals. Between each such reshuffle, a one-shot prisoners’ dilemma game is played recurrently in every group. A crucial assumption is that the population state in every group converges to a limit state before groups are reshuffled. The process is thus adiabatic: individual selection within groups is an order of magnitude faster than group selection. This feature of the model implies that all cooperators in mixed groups become extinct before it is time to reshuffle the groups. Only cooperators in groups that consist exclusively of cooperators will survive. The fact that such groups must be pure, and that they must stay isolated for long periods of time, led Maynard Smith to conclude that circumstances for group selection to be effective were so special that group selection was unlikely to play an important role.

Despite the fact that this model suggested that group selection was only unlikely, not impossible, it was viewed as a resounding rejection of the concept. Consequently, after Maynard Smith’s and Wynne-Edwards’ exchange, and after a passionate criticism of the concept by Williams (1966), the group selection idea all but disappeared from the evolutionary literature.¹ When it was mentioned, it was rather as a cautionary tale of how evolutionary selection does not work. In later years, however, group selection has had a vivid revival. The literature has in fact become much too large to be fairly treated here. Surveys of the group selection literature are given in Bergstrom (2002) and Wilson and Sober (1994), and recent contributions, with extensive discussions, are given in Kerr and Godfrey-Smith (2002) and Henrich (2003).

The aim of this study is not to provide arguments for or against group selection, but

instead to suggest a parsimonious, operational and simple population selection model

that allows for both individual and group selection, without the adiabatic assumption

of the haystack model. In a nutshell, our model is as follows. A large population of

individuals are randomly matched into groups. The interaction in each group takes the

form of a finite symmetric game. The game can be simple or complex, and may consist

of one or many stages – as, for example, in finitely repeated games. All individuals

play pure strategies in their group. The play of the game in a group results in material

payoffs to the group members. Each individual breeds true, and the expected number

of offspring depends on the individual’s own material payoff. All offspring are subject to

an exogenous hazard, such as infectious diseases, harsh weather conditions, or attacks

from predators. The expected share of survivors among the offspring in a group may

depend on all material payoffs in the group. As a canonical example of this, we will

assume that the expected share of survivors is proportional to the sum of material

payoffs. However, we will also consider other functional forms, such as the minimum or

product of material payoffs in the group. The key assumption in our model is weaker:

all that matters is that the expected number of surviving offspring may depend, in part,

on other group members’ material payoffs. Such dependence seems likely in situations

where groups, and their offspring, stay together for some length of time. We here take

this dependence as a primitive, although this, in its turn, may have arisen from the

interplay between material production and reproduction conditions as well as forms

of social interaction, “institutions”, and habits, of the population under study. For

instance, as humans turned from hunting and gathering to agriculture, the form of

dependence most likely changed.

We show that the mean-field equation for the induced stochastic population process is identical with the Taylor and Jonker (1978) replicator dynamic for a certain derived game. The payoffs in this derived game are functions of the vector of individual material payoffs. Relying on established results for the replicator dynamic, predictions for long-run population states can then be made. In particular, if the dependence of survival probabilities on other group members’ material payoffs is sufficiently strong, cooperation among group members may emerge in the long run.

¹ Wilson (1983) gives a more detailed description of this period, from a proponent’s point of view.

We illustrate the implications of our approach by way of a number of examples,

including prisoners’ dilemma games, coordination games, hawk-dove games, a prisoners’

dilemma with possibility to punish a defector, and common-pool games. In particular,

for each of these games we provide conditions under which our selection dynamic leads

to other outcomes than those obtained under pure individual selection. As expected,

the effect of group selection is to promote behaviors that benefit the common good for

the group. However, in some games, and for certain survival functions, the effect is too

weak to cause any change of the long-run outcome.

Utility theory in economics is based upon a revealed-preference principle; human

behavior is interpreted as the result of rational choice according to some underlying

binary preference relation over outcomes, or, more generally, lotteries over outcomes.

If choice behaviors meet certain regularity conditions with respect to variations of the

set of alternatives, there exists a utility function for the decision-maker such that his

or her behavior is consistent with the maximization of the expected value of that function.

Such a mathematical representation allows for powerful analysis and prediction

of behaviors in new environments.

By way of a similar revealed-preference argument, here applied to the asymptotic population behavior under our selection dynamic, we argue that the results of selection, in some situations, allow for the interpretation that individuals are rational decision-makers with utility functions given by the payoffs of the derived game, and even that this rationality and those preferences are common knowledge among all individuals in the population. If aggregate population behavior converges in our selection dynamic, then the limit population state will correspond to a symmetric Nash equilibrium of the derived game. It is then as if individuals, on top of the above-mentioned rationality, had consistent expectations about each other’s behaviors. Moreover, the payoffs in the derived game in general depend on all players’ material payoffs, so the revealed preferences are “other-regarding” or “social” – typically combining a concern for one’s own material payoff with some concern for the material well-being of others. In this limited sense, the present model provides an evolutionary underpinning of the hypothesis of game theoretic rationality combined with social preferences – a common hypothesis in much of modern “behavioral” game theory. Our approach also suggests a certain class of social preferences, to the best of our knowledge not studied before, where an individual’s utility is the product of an individualistic utility function and a social welfare function.

The rest of the paper is organized as follows. The model is formalized in section 2 and applied to examples in section 3. Section 4 discusses briefly the evolutionary asymmetry between rewards and punishments. Implications for “as if” rationality and social preferences are discussed in section 5. Related literature is discussed in section 6, and section 7 concludes.

2 Model

Consider a finite and symmetric two-player game $G$ with pure strategy set $S = \{1, 2, \dots, m\}$ and payoff matrix $\Pi = (\pi_{hk})$, where $\pi_{hk}$ is the material payoff to pure strategy $h \in S$ when played against pure strategy $k \in S$. Let $u(x, y)$ denote the expected material payoff to mixed strategy $x \in \Delta(S)$ when played against mixed strategy $y \in \Delta(S)$:

$$u(x, y) = \sum_{h \in S} \sum_{k \in S} x_h \pi_{hk} y_k \tag{1}$$

Suppose this game is played recurrently in randomly matched groups of size 2, drawn

from a finite population in which every individual is “programmed” to play a certain

pure strategy. Let N (t) be the population size at time t, and for each pure strategy

h ∈ S, let Nh (t) be the number of “h-strategists” in the current population. At times

t = 0, ∆, 2∆, ..., where ∆ > 0, [N (t) /2] groups of size 2 are randomly formed.² For

each individual, all matches with others are equally likely.

In each such time period, every group plays the game $G$ once, and each individual breeds true; all offspring inherit their single parent’s pure strategy.³ The expected number of surviving offspring to an $h$-strategist in a group where the other member plays $k \in S$ is $\Delta \cdot \phi(\pi_{hk}, \pi_{kh})$, where $\phi : \mathbb{R}^2 \to \mathbb{R}_+$. Hence, $\phi(\pi_{hk}, \pi_{kh})$ is the fitness of pure strategy $h$ against pure strategy $k$. At the end of each period, all surviving individuals from all groups are brought together, and a fixed fraction $\Delta \cdot \gamma \ge 0$ die, where $\gamma \ge 0$ is the common death rate (this rate turns out to play no role). We have in mind, as a canonical example, multiplicative fitness functions where the first factor is an increasing function of own material payoff, representing the number of offspring, and the second is an increasing function of some aggregate of both group members’ material payoffs, representing the survival probability of each offspring in the group.

² Here $[x]$ denotes the integer part of a real number $x$, the largest integer not exceeding $x$. If $N(t)$ is odd, then one individual is not assigned to any group. The focus is here on large populations, and it is then immaterial what happens to the left-out individual. For the sake of definiteness, assume that such an individual does not reproduce.

³ The time period $\Delta$ may be long, say a year, and the interaction may take the form of a finitely repeated game, say a stage game played each day of the year.

This model easily generalizes to a single population playing a finite and symmetric

n-player game (see below), and also to n populations, one for each player role, playing

a finite n-player game. In all these cases, a group is defined as a random match between

n individuals. In the first case, the fitness function $\phi : \mathbb{R}^n \to \mathbb{R}_+$ is symmetric in the sense that it is invariant under permutations of other players’ payoffs.

2.1 The induced selection dynamic

The mean-field equation for the induced stochastic population process can be derived as follows. Assume that the population is non-extinct in some period $t \in \{0, \Delta, 2\Delta, \dots\}$. For every pure strategy $h \in S$, let $x_h(t)$ denote the population share of $h$-strategists: $x_h(t) = N_h(t)/N(t)$. For each pure strategy $h$ and an even number $N(t)$ of individuals, the expected number of $h$-strategists in the next period is

$$E\left[N_h(t+\Delta) \mid N_1(t), \dots, N_m(t)\right] = \left(1 - \Delta\gamma + \Delta \sum_{k \in S} \frac{N_k(t) - \delta_{hk}}{N(t) - 1}\, \phi(\pi_{hk}, \pi_{kh})\right) N_h(t), \tag{2}$$

where $\delta_{hk}$ is Kronecker’s delta.⁴ For $N(t)$ large, we thus have

$$\frac{E\left[N_h(t+\Delta) \mid N_1(t), \dots, N_m(t)\right] - N_h(t)}{\Delta} \approx \left[\sum_{k \in S} x_k(t)\, \phi(\pi_{hk}, \pi_{kh}) - \gamma\right] N_h(t). \tag{3}$$

Taking the limit $\Delta \to 0$, and dividing through by $N(t)$, we obtain the mean-field equation

$$\dot{x}_h = \left[\tilde{u}(e^h, x) - \tilde{u}(x, x)\right] x_h, \tag{4}$$

where $e^h$ is the unit vector in direction $h$ (let $e^h_k = 0$ for all coordinates $k \neq h$ and $e^h_h = 1$) and $\tilde{u} : [\Delta(S)]^2 \to \mathbb{R}_+$ is defined by

$$\tilde{u}(x, y) = \sum_{h \in S} \sum_{k \in S} x_h\, \phi(\pi_{hk}, \pi_{kh})\, y_k. \tag{5}$$

⁴ That is, $\delta_{hk} = 1$ if $h = k$, otherwise $\delta_{hk} = 0$. For $N(t)$ odd, the denominator is $N(t) - 2$ instead of $N(t) - 1$.

The function $\tilde{u}$ is the derived payoff function associated with the pure-strategy payoff matrix $\tilde{\Pi} = (\tilde{\pi}_{hk})$, where

$$\tilde{\pi}_{hk} = \phi(\pi_{hk}, \pi_{kh}). \tag{6}$$
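The step from (3) to (4) can be spelled out as follows. This is a sketch of the standard argument, writing $\tilde{u}$ for the derived payoff function in (5) and treating the population counts as deterministic, differentiable functions of time in the large-population limit (an assumption of the mean-field approximation, not an additional result):

```latex
\begin{align*}
\dot{N}_h &= \left[\tilde{u}(e^h, x) - \gamma\right] N_h, \\
\dot{N} &= \sum_{h \in S} \dot{N}_h = \left[\tilde{u}(x, x) - \gamma\right] N, \\
\dot{x}_h &= \frac{\dot{N}_h}{N} - x_h \frac{\dot{N}}{N}
           = \left[\tilde{u}(e^h, x) - \tilde{u}(x, x)\right] x_h .
\end{align*}
```

The death rate $\gamma$ cancels in the last line, which is why it plays no role in the selection dynamic.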

The selection dynamic (4) is thus nothing but the Taylor and Jonker (1978) replicator dynamic

for the derived game. In the special case when fitness is a positive

affine function of own material payoff only, φ (πhk, πkh) ≡ α + βπhk for some β > 0,

the selection dynamic (4) is proportional to the standard replicator dynamic, and thus

has identical solution orbits. The present model hence contains the usual model of

individual selection as a special case.
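The construction of the derived game and the resulting selection dynamic can be sketched numerically. The snippet below is only an illustration: the function names, the explicit Euler integration, the step size and the example payoff numbers are all assumptions made here, not part of the model. It checks that an affine fitness function rescales the replicator vector field by the factor β, so orbits coincide with those of the standard replicator dynamic.

```python
def derived_payoffs(material, fitness):
    """Derived payoff matrix: entry (h, k) is fitness(pi_hk, pi_kh)."""
    m = len(material)
    return [[fitness(material[h][k], material[k][h]) for k in range(m)]
            for h in range(m)]

def replicator_field(x, payoffs):
    """Right-hand side of (4): [u(e^h, x) - u(x, x)] * x_h for each h."""
    payoff_to_pure = [sum(p * xk for p, xk in zip(row, x)) for row in payoffs]
    average = sum(xh * uh for xh, uh in zip(x, payoff_to_pure))
    return [(uh - average) * xh for xh, uh in zip(x, payoff_to_pure)]

def simulate(x0, payoffs, dt=0.01, steps=20000):
    """Explicit Euler integration (a crude, illustrative choice)."""
    x = list(x0)
    for _ in range(steps):
        field = replicator_field(x, payoffs)
        x = [xh + dt * fh for xh, fh in zip(x, field)]
        total = sum(x)
        x = [xh / total for xh in x]  # guard against rounding drift
    return x

# Hypothetical prisoners' dilemma in material payoffs: strategy 0 = C, 1 = D.
material = [[2.0, 0.5], [3.0, 1.0]]
alpha, beta = 1.0, 2.0  # affine fitness in own payoff only (illustrative values)
affine = derived_payoffs(material, lambda own, other: alpha + beta * own)

x_start = [0.9, 0.1]
field_affine = replicator_field(x_start, affine)
field_material = replicator_field(x_start, material)
long_run = simulate(x_start, affine)
```

With fitness depending on own payoff only, `long_run` ends up with almost all weight on D, as under pure individual selection. Passing a fitness function that also depends on `other` changes the derived matrix and may change the long-run state.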

2.2 2×2 games

Applied to a symmetric 2×2 game, our approach gives

$$\tilde{\Pi} = \begin{pmatrix} \phi(\pi_{11}, \pi_{11}) & \phi(\pi_{12}, \pi_{21}) \\ \phi(\pi_{21}, \pi_{12}) & \phi(\pi_{22}, \pi_{22}) \end{pmatrix}. \tag{7}$$

The best-reply correspondence, weak and strict dominance, risk dominance, and the replicator dynamic are all unaffected by the addition or subtraction of a constant to a column of the payoff matrix (see e.g. Weibull (1995)), so the derived game is equivalent in these respects to the normalized derived game

$$\hat{\Pi} = \begin{pmatrix} \phi(\pi_{11}, \pi_{11}) - \phi(\pi_{21}, \pi_{12}) & 0 \\ 0 & \phi(\pi_{22}, \pi_{22}) - \phi(\pi_{12}, \pi_{21}) \end{pmatrix}. \tag{8}$$

This game, and hence also the derived game, is a (strict) coordination (CO) game if both diagonal entries are positive, a (strict) hawk-dove (HD) game if both diagonal entries are negative, and a (strictly) dominance-solvable (DS) game if the diagonal entries have opposite signs. In the case of a CO-game, the pure-strategy pair (1, 1) strictly risk dominates the pure-strategy pair (2, 2) if and only if the first diagonal entry, $\hat{\pi}_{11}$, exceeds the second, $\hat{\pi}_{22}$. In the case of a DS-game, the derived game $\tilde{\Pi}$, but not necessarily the normalized derived game $\hat{\Pi}$, is a prisoners’ dilemma (PD) game if and only if the dominant pure strategy earns less against itself than the other pure strategy earns against itself, in terms of derived payoffs. In the opposite case, we call the derived game $\tilde{\Pi}$ an efficient dominance-solvable (ED) game.

The replicator dynamic in a generic symmetric 2×2 game converges from all initial states. Moreover, the limit point is a best reply to itself (the strategy of a symmetric

Nash equilibrium) if the initial state is interior. Strict CO-games have two attractors

– the whole population playing one of the two pure strategies – and their basins of

attraction are separated by the unique (but unstable) mixed Nash equilibrium strategy.

Strict HD-games have one attractor, the population mix defined by the unique mixed

Nash equilibrium strategy. Strict DS-games, finally, also have a unique attractor – the

whole population playing the dominant pure strategy – irrespective of whether this is

socially efficient or not.

3 Examples

We illustrate the present selection model by way of a few examples.

3.1 A family of 2×2 games

Consider symmetric 2×2 games with material payoffs

$$\Pi = \begin{pmatrix} 2 & a \\ b & 1 \end{pmatrix}, \tag{9}$$

for arbitrary constants a and b. Such a game is an HD-game when a > 1 and b > 2, an ED-game when a > 1 and b < 2, a CO-game when a < 1 and b < 2, and a PD-game when a < 1 and b > 2. These conditions cut the (a, b)-plane into four regions, oriented clockwise around the point (1, 2); see the straight cross in Figure 1 below.

Suppose that fitness is bilinear in own material payoff and in the group’s material payoff sum. With $v_i$ denoting own material payoff and $v_{-i}$ that of the other individual:

$$\phi(v_i, v_{-i}) = v_i (v_i + v_{-i}). \tag{10}$$

Such a fitness function arises if the expected number of offspring is proportional to own payoff and the survival probability of all offspring is proportional to the group’s total resources. The derived game is then

$$\tilde{\Pi} = \begin{pmatrix} 8 & a(a+b) \\ b(a+b) & 2 \end{pmatrix}. \tag{11}$$

The two curves in Figure 1 divide the (a, b)-plane into four regions that determine

the nature of the derived game. The regions are oriented in the same way as for

the original game: HD-games being located north-east, CO-games south-west, and

dominance solvable games south-east (ED-games) and north-west (PD-games) of the

two curves’ intersection.

Figure 1: Parameter combinations (a, b) and the nature of the two games $\Pi$ (straight lines) and $\tilde{\Pi}$ (curves). [Horizontal axis: a, from 0 to 2; vertical axis: b, from 0 to 4.]

We see, in particular, that if the game Π, defined in terms of material payoffs, is a PD game, then the derived game can be any one of the four generic game types. Suppose, for example, that a = 0.8 and b = 3. In this case, the long-run population state is a certain interior state – the mixed strategy in the derived HD-game – for all interior initial states. As another example, suppose a = 0.6 and b = 2.4. Then the derived game is a CO-game. Hence, the long-run population state depends on the initial state. In particular, if there are sufficiently many cooperators in the initial population state, then defectors will be asymptotically wiped out from the population – although the game is a PD-game in terms of material payoffs. In the presence of perpetual random mutations, as modelled in Kandori, Mailath and Rob (1993), the population process becomes ergodic, and its invariant distribution places virtually all probability mass on the population state where all individuals cooperate if the mutation rate is low. This follows from the observation that the (C,C) equilibrium risk dominates the (D,D) equilibrium in the derived game for these parameter values.⁵
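The two numerical examples above can be checked with a few lines of code. The following is a minimal sketch, assuming the bilinear fitness function (10) and the sign conditions on the diagonal of the normalized game (8); the function names and the string labels are illustrative choices made here.

```python
def bilinear_fitness(own, other):
    """Fitness (10): own payoff times the group's payoff sum."""
    return own * (own + other)

def own_payoff_fitness(own, other):
    """Pure individual selection: fitness equals own material payoff."""
    return own

def classify(a, b, fitness):
    """Type of the 2x2 game derived from material payoffs [[2, a], [b, 1]]."""
    p11, p12 = fitness(2, 2), fitness(a, b)
    p21, p22 = fitness(b, a), fitness(1, 1)
    d1, d2 = p11 - p21, p22 - p12  # diagonal entries of the normalized game
    if d1 == 0 or d2 == 0:
        return "non-generic"
    if d1 > 0 and d2 > 0:
        return "CO"
    if d1 < 0 and d2 < 0:
        return "HD"
    # Dominance-solvable: strategy 1 is dominant exactly when d1 > 0.
    dominant_self, other_self = (p11, p22) if d1 > 0 else (p22, p11)
    return "PD" if dominant_self < other_self else "ED"

for a, b in [(0.8, 3.0), (0.6, 2.4)]:
    print(a, b, classify(a, b, own_payoff_fitness), classify(a, b, bilinear_fitness))
```

Both parameter pairs give a PD-game in material payoffs, while the derived game is an HD-game for (0.8, 3) and a CO-game for (0.6, 2.4).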

3.2 Punishing defectors and rewarding cooperators

There is experimental evidence, see Fehr and Gächter (2002), that human subjects

punish defectors in public-goods provision interactions, even when such punishment

is costly to the punisher. The threat of such punishment enhances cooperation and

hence welfare in interacting groups of human subjects. However, to implement such

punishment violates individual sequential rationality, as applied to the material payoffs

of the game. Can punishment behaviors be explained by the present model? Figure

3 below shows a game-theoretic representation of public-goods provision that allows

cooperators to punish defectors.

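Figure 3: A two-stage prisoner’s dilemma game with punishment option. [Game tree: players 1 and 2 choose C or D; a player who cooperated against a defector then chooses P (punish) or N (not punish). Terminal payoffs to players 1 and 2: (2, 2) after (C, C); (1, 1) after (D, D); (a, b) after N and (a − c, b − d) after P when player 1 cooperates and player 2 defects; (b, a) after N and (b − d, a − c) after P in the reverse case.]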

The first stage of this game is a simultaneous-move prisoners’ dilemma, where each player chooses C or D, with material payoffs according to (9), for a < 1 and b > 2. In the second stage, a player who cooperated in stage one has the option to punish defection by the other player. The cost of punishing is c > 0 and the effect of punishment is a reduction of the punished player’s payoff by d > 0.

⁵ To see this, note that the normalized derived game has diagonal elements 0.8 and 0.2, respectively.

The unique subgame perfect equilibrium of this extensive-form game is not to punish

– since this reduces the punisher’s material payoff – and hence for both players to

defect in the first stage. In the normal form of this symmetric two-player game, each

player has four pure strategies 1=CN, 2=CP, 3=DN and 4=DP at his or her disposal,

and the payoff matrix of this symmetric game is

$$\Pi = \begin{pmatrix} 2 & 2 & a & a \\ 2 & 2 & a - c & a - c \\ b & b - d & 1 & 1 \\ b & b - d & 1 & 1 \end{pmatrix}.$$

Not surprisingly, the pure strategy CN weakly dominates strategy CP. However, if the punishment is not too harsh, d < b − 2, then each of the two behaviorally equivalent strategies DN and DP strictly dominates CN (and hence also CP). In such cases, the game has a unique Nash equilibrium component, where both players play arbitrary mixes between DN and DP, and this component attracts all interior solution orbits of the replicator dynamic. However, if punishment is harsh, d > b − 2, there exists another component of symmetric Nash equilibria: all mixed strategies pCN + (1 − p)CP with

$$p \le 1 - (b - 2)/d \tag{12}$$

are then best replies to themselves. A large set of initial population states lead asymptotically to this set; see the solution orbits in Figure 4 below, computed for a = 0.5, b = 3, c = 0.5 and d = 1.5, hence p ≤ 1/3, where “D” stands for the sum of the population shares playing DN and DP.

Figure 4: Solution orbits to the replicator dynamic for the game defined in terms of material payoffs. [Simplex with vertices CN, CP and D.]

All points in the cooperative Nash equilibrium component, except for its end-point, $\tfrac{1}{3}CN + \tfrac{2}{3}CP$, are Lyapunov stable (small pushes do not lead far away), but the component is not an attractor since its end-point is unstable. The reason for the relative persistence of these equilibria is that near this equilibrium component defections are rare and hence punishment not very costly (CP gives almost the same expected payoff as CN). Hence, defectors “learn” that defections are likely to be followed by punishments, and punishers “learn” that CP is somewhat more costly than CN. These two adaptations occur at comparable rates in the selection dynamic, and hence the population state moves back toward the cooperative equilibrium component, except when the population state is close to the end-point of the equilibrium component. Similar dynamic phenomena have been observed in Binmore and Samuelson (1999), in the context of ultimatum bargaining, and in Sethi and Somanathan (1996) for the tragedy of the commons.

The derived payoff matrix is

$$\tilde{\Pi} = \begin{pmatrix} \phi(2, 2) & \phi(2, 2) & \phi(a, b) & \phi(a, b) \\ \phi(2, 2) & \phi(2, 2) & \phi(a - c, b - d) & \phi(a - c, b - d) \\ \phi(b, a) & \phi(b - d, a - c) & \phi(1, 1) & \phi(1, 1) \\ \phi(b, a) & \phi(b - d, a - c) & \phi(1, 1) & \phi(1, 1) \end{pmatrix}.$$

Suppose that the fitness function φ is strictly increasing in both arguments, that is, that higher material payoff to any one of the group members increases the expected number of surviving offspring to both members. The pure strategy CP is clearly weakly dominated by strategy CN also in the derived game. In order to highlight the role of the punishment option, we henceforth focus on cases where

φ (a, b) < φ (1, 1) and φ (b, a) > φ (2, 2) , (13)

that is, where the derived game would be a PD-game in the absence of the punishment

option.

Strategies DN and DP do not strictly dominate CN and CP in the derived game if

φ (b− d, a− c) ≤ φ (2, 2) , (14)

that is, if the punishment is sufficiently harsh and/or the cost of punishing sufficiently

high. This is, thus, a qualitative difference between the games defined in terms of

material and derived payoffs, respectively.⁶

Under (13) and (14), all mixed strategies pCN + (1 − p)CP with

$$p \le \frac{\phi(2, 2) - \phi(b - d, a - c)}{\phi(b, a) - \phi(b - d, a - c)} \tag{15}$$

are best replies to themselves. Hence, not surprisingly, there is a “cooperative” symmetric Nash equilibrium component also in the derived game. Indeed, we would expect this component to be larger, and attract a larger set of initial population states, than in the game based on individual material payoffs. It is not difficult to confirm this conjecture under bilinear fitness (10), granted b > d and a + b > c + d, inequalities that are met in the above numerical example.⁷ In this sense, group selection makes cooperation “more common.” Solution orbits to the replicator dynamic in the derived game are shown in Figure 5 below, based on the bilinear fitness function and the same parameter values as in the preceding diagram.

⁶ With bilinear fitness (10), conditions (13) and (14) are, for example, met in the numerical example in Figure 4.

⁷ The right-hand side in (15) exceeds that in (12) iff 8d + (b − 2)(b − d)(c + d) − 2d(a + b) > 0.

Figure 5: Solution orbits to the replicator dynamic for the derived game. [Simplex with vertices CN, CP and D.]

The end-point of the cooperative equilibrium component has moved up from p = 1/3 to p ≈ 0.7.⁸ We also see that the basin of initial states that tend towards the cooperative component has increased significantly. In the presence of random mutations, however, the unique outcome in the “ultra long run” is still that everyone defects.
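The two end-points can be recomputed directly from (12) and (15). The sketch below assumes the bilinear fitness function (10) and the parameter values used for Figures 4 and 5; the function and variable names are hypothetical, chosen here for illustration.

```python
def bilinear_fitness(own, other):
    """Fitness (10): own payoff times the group's payoff sum."""
    return own * (own + other)

def material_endpoint(b, d):
    """Right-hand side of (12): largest share p of CN in the material game."""
    return 1 - (b - 2) / d

def derived_endpoint(a, b, c, d, fitness):
    """Right-hand side of (15): largest share p of CN in the derived game."""
    punished_defector = fitness(b - d, a - c)
    return (fitness(2, 2) - punished_defector) / (fitness(b, a) - punished_defector)

a, b, c, d = 0.5, 3.0, 0.5, 1.5
p_material = material_endpoint(b, d)
p_derived = derived_endpoint(a, b, c, d, bilinear_fitness)

# Condition from footnote 7 for the derived component to be the larger one.
footnote_7 = 8 * d + (b - 2) * (b - d) * (c + d) - 2 * d * (a + b)

print(p_material, p_derived, footnote_7)
```

For these parameter values the material end-point is 1/3, the derived end-point is 5.75/8.25, and the expression in footnote 7 is positive, in line with the comparison made in the text.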

3.3 Defence of a common resource pool

We finally apply our model to an example discussed in Boyd and Richerson (1985).

Consider, thus, a group of n individuals who have a common pool of resources, say

a herd of domestic animals, where each group member owns one n:th of the pool.

The total pool is worth w, thus w/n for each group member. The pool is occasionally

exposed to some hazard– for example a predator or tempest. Group members therefore

take turns to guard it. The guard has a binary choice in case the hazard materializes:

8The right-hand side in (15) becomes 8−2.2510.5−2.25 ≈ .697.

14

Page 15: Group selection and social preferences · 2017. 4. 12. · Jörgen Weibull† and Marcus Salomonsson‡ Stockholm School of Economics, Box 6501, SE - 113 83 Stockholm, Sweden April

either to defend the common pool at a cost c to him- or herself, and thereby save the

pool, or not to act (at zero cost to him- or herself) in which case the amount d of the

pool will be lost to the group, where 0 < d < w. If d/n < c < d, as we here assume, it

is in the group’s interest that the guard defends the pool: the expected total material

payoff to the group is then w − pc, where p is the probability of the hazard, while the

expected total material payoff to the group is otherwise w − pd. However, the guard

has no material incentive to defend the pool in case the hazard materializes: his or her

material payoff when defending is w/n − c, which is less than (w − d)/n, the payoff from not acting. In other words,

the expected payoff to the group is maximized if every member, when on guard, defends

the resource, but it is individually rational, as defined in terms of own material payoffs,

for the guard not to act in case the hazard materializes.

Such situations are of the type alluded to in the introductory quote by Darwin, and

can be modelled as finite and symmetric (n + 1)-player games, where players i = 1, ..., n

are the group members and player 0 is “nature” who randomly selects one of the group

members as guard, with probability 1/n for each individual to be so selected. Nature

also makes a second random draw, statistically independent from its first draw, namely

whether or not the hazard will materialize. Let p be the hazard probability, where

0 < p ≤ 1. Each personal player thus has available two pure strategies, 0 (no action) and 1 (defense). The expected material payoff to such a player i, under any pure-strategy profile s = (s1, s2, ..., sn), is

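$$
\pi(s_i, s_{-i}) =
\begin{cases}
(1-p)\,\dfrac{w}{n} + p\left[\dfrac{1}{n}\left(\dfrac{w}{n} - \dfrac{d}{n}\right) + \left(1 - \dfrac{1}{n}\right) y_i\right] & \text{for } s_i = 0 \\[2ex]
(1-p)\,\dfrac{w}{n} + p\left[\dfrac{1}{n}\left(\dfrac{w}{n} - c\right) + \left(1 - \dfrac{1}{n}\right) y_i\right] & \text{for } s_i = 1
\end{cases}
, \tag{16}
$$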

where yi is the conditionally expected value to member i of his or her share of the

common pool when another group member is on guard, given that the hazard hits:

$$
y_i = \frac{1}{n-1}\left[\left(n - 1 - \sum_{j \neq i} s_j\right)\frac{w - d}{n} + \frac{w}{n}\sum_{j \neq i} s_j\right].
$$

Under the maintained hypothesis d/n < c < d, pure strategy 0 strictly dominates

pure strategy 1. Hence, as noted by Boyd and Richerson (1985), the standard replicator

dynamic, applied to the game defined in terms of individual material payoffs, asymp-

totically wipes out strategy 1 from the population, from any interior initial population

state. What happens in the present selection dynamic?
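The dominance claim for the material-payoff game can be checked numerically. The following is a minimal Python sketch of the payoff function (16) and of y_i as given above; the function names and the parameter values in the usage note are illustrative assumptions, not part of the original analysis, and the sketch presumes n ≥ 2.

```python
from itertools import product


def y_i(others, n, w, d):
    """Expected share of member i when another member is on guard and the hazard hits."""
    k = sum(others)  # number of other group members who would defend
    return ((n - 1 - k) * (w - d) / n + k * w / n) / (n - 1)


def material_payoff(s_i, others, n, w, c, d, p):
    """Expected material payoff (16) for own strategy s_i in {0, 1}."""
    # Payoff when member i is the guard and the hazard materializes.
    own_guard = (w / n - c) if s_i == 1 else (w - d) / n
    hit = own_guard / n + (1 - 1 / n) * y_i(others, n, w, d)
    return (1 - p) * w / n + p * hit


def strategy_0_dominates(n, w, c, d, p):
    """True if 'no action' is strictly better against every profile of the others."""
    return all(
        material_payoff(0, others, n, w, c, d, p)
        > material_payoff(1, others, n, w, c, d, p)
        for others in product((0, 1), repeat=n - 1)
    )
```

For instance, with the hypothetical values n = 3, w = 3, c = 0.5, d = 1.2 and p = 0.8, which satisfy d/n < c < d, `strategy_0_dominates` returns `True`; lowering c below d/n reverses the result. The payoff difference between the two strategies is p(c − d/n)/n regardless of what the other members do, which is why the check does not depend on `others`.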


In our framework, and focusing on the case n = 2, the material payoff matrix is

$$
\Pi = \frac{1}{2}
\begin{pmatrix}
(1-p)w + p(w - d) & (1-p)w + p(w - d/2) \\
(1-p)w + p(w - c - d/2) & (1-p)w + p(w - c)
\end{pmatrix}.
$$

For p = 1, that is, a sure hazard, the derived payoff matrix is

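$$
\tilde{\Pi} =
\begin{pmatrix}
\phi\left[(w - d)/2,\ (w - d)/2\right] & \phi\left[(w - d/2)/2,\ (w - c - d/2)/2\right] \\
\phi\left[(w - c - d/2)/2,\ (w - d/2)/2\right] & \phi\left[(w - c)/2,\ (w - c)/2\right]
\end{pmatrix}.
$$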

With bilinear fitness (10), defence of the common property, pure strategy 1, is

strictly dominant in the derived game if and only if

$$
(w - d)^2 < \left(w - c - \frac{d}{2}\right)\left(w - \frac{c + d}{2}\right). \tag{17}
$$

Figure 7 below shows when this condition is satisfied, in the (c, d)-plane, for w = 2.

The condition is satisfied to the left of the curve. For parameter pairs (c, d) to the left

of the steeper straight line (c = d/2), pure strategy 1 is strictly dominant also in terms

of individual material payoffs.
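Condition (17) can be compared directly with strict dominance in the derived game. The sketch below does this in Python for n = 2 and p = 1. It assumes that the bilinear fitness function (10), which is not restated in this section, has the form φ(v_i, v_j) = v_i (v_i + v_j) up to a positive constant; this assumed form reproduces (17) but should be checked against (10). All function names are illustrative.

```python
def phi(v_own, v_other):
    """Assumed bilinear fitness: own payoff times the group's payoff sum."""
    return v_own * (v_own + v_other)


def derived_matrix(w, c, d):
    """Derived payoffs for n = 2, p = 1; index 0 = no action, 1 = defend."""
    # v[h][k] is the material payoff to a player using h against a player using k.
    v = [[(w - d) / 2, (w - d / 2) / 2],
         [(w - c - d / 2) / 2, (w - c) / 2]]
    return [[phi(v[h][k], v[k][h]) for k in range(2)] for h in range(2)]


def defence_dominant(w, c, d):
    """Strict dominance of strategy 1 in the derived game (both columns)."""
    a = derived_matrix(w, c, d)
    return a[1][0] > a[0][0] and a[1][1] > a[0][1]


def condition_17(w, c, d):
    """Inequality (17) as stated in the text."""
    return (w - d) ** 2 < (w - c - d / 2) * (w - (c + d) / 2)
```

With w = 2, the pair (c, d) = (0.7, 1.2) lies between the two straight lines (d/2 < c < d) and satisfies both `condition_17` and `defence_dominant`, whereas (c, d) = (0.8, 1.2) satisfies neither. Under the assumed φ, the first dominance inequality is (17) itself, and the second is then implied by it.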

Figure 7: Parameter combinations (c, d) for which defence of a common property is a dominant strategy in the derived game. (Axes: c horizontal, from 0 to 1.5; d vertical, from 0 to 2.)

For parameter pairs to the right of the less steep straight line (c = d), strategy 1 is

socially inefficient: the group’s total material payoff is maximized if the guard takes no

action. Hence, the effect of group selection is to expand the set of parameter combi-


nations (c, d) for which pure strategy 1 is dominant, from the region to the left of the

steeper straight line to all points to the left of the curve. For parameter combinations

to the left of the curve, our selection dynamic, starting from any interior initial popu-

lation state, will asymptotically wipe out strategy 0 (“no action”) from the population.

Group selection, as modelled here, is then sufficiently strong to asymptotically lead the

population towards the state where all group members defend the common pool, and

hence to the socially efficient outcome: in the long run, individuals behave as if they

were rational and had preferences according to the derived payoff matrix.
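The contrast between the two dynamics can be illustrated by a small simulation. The following Python sketch applies Euler steps of the two-strategy replicator dynamic, first to the material payoffs and then to the derived payoffs, for n = 2 and p = 1. The bilinear form φ(v_i, v_j) = v_i (v_i + v_j) is an assumption consistent with (17), and the step size, horizon and parameter values are hypothetical choices.

```python
def replicator_share(a, x0, dt=0.01, steps=20000):
    """Euler steps of the replicator dynamic; x is the population share of strategy 1."""
    x = x0
    for _ in range(steps):
        u0 = (1 - x) * a[0][0] + x * a[0][1]  # payoff to strategy 0 at state x
        u1 = (1 - x) * a[1][0] + x * a[1][1]  # payoff to strategy 1 at state x
        x += dt * x * (1 - x) * (u1 - u0)
    return x


def bilinear(v_own, v_other):
    """Assumed form of the bilinear fitness function (10)."""
    return v_own * (v_own + v_other)


w, c, d = 2.0, 0.7, 1.2  # hypothetical values with d/2 < c < d satisfying (17)
material = [[(w - d) / 2, (w - d / 2) / 2],
            [(w - c - d / 2) / 2, (w - c) / 2]]
derived = [[bilinear(material[h][k], material[k][h]) for k in range(2)]
           for h in range(2)]

share_material = replicator_share(material, x0=0.5)  # defenders die out
share_derived = replicator_share(derived, x0=0.5)    # defenders take over
```

Starting from an even split, the share of defenders falls towards zero under the material payoffs and rises towards one under the derived payoffs, in line with the discussion above. A forward Euler scheme is used only for brevity; any standard ODE solver would serve equally well.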

4 The evolutionary logic of rewards and punishments

We here make some general remarks about the rationality of punishments and rewards. Consider, thus, a behavior strategy profile in a finite extensive-form game and an information set on the path of this profile (that is, an information set that is reached with

positive probability when the profile is played). A behavior strategy is sequentially ra-

tional (Kreps and Wilson, 1982) at such an information set if its conditionally expected

payoff, conditioned on the induced probabilities at its nodes, cannot be exceeded by

any other behavior strategy, when used from this information set on. If the payoffs in

the game tree are the derived payoffs, as defined here, then Lyapunov stability in our

selection dynamic implies sequential rationality at all information sets that are reached

with positive probability by play.9

Consider now a decision node in a finite extensive-form game where the player

has perfect information about what others have done before his or her move (thus, a

singleton information set) and where each move at the node is immediately followed by

a terminal node in the game tree. Suppose, first, that the player i in question has the

option to punish another player j. Let vi and vj be the two players’ material payoffs if

i chooses not to punish j, and let the payoffs be vi − c and vj − d if i does choose to

punish j, where c, d > 0. In terms of material payoffs, it is thus sequentially rational

not to punish. The same holds true in terms of derived payoffs, if these are increasing

functions of material payoffs, since the derived payoff to player i from punishing j,

φ (vi − c, vj − d), is then lower than the payoff from not punishing, φ (vi, vj). Hence, our

9 This follows from the two facts that (a) Lyapunov stability in the replicator dynamic implies Nash equilibrium, and (b) a behavior strategy profile in a finite extensive form game is a Nash equilibrium iff it prescribes sequentially rational play at all information sets on its path, see van Damme (1987).


model of group selection does not render punishment sequentially rational. However,

as was shown above, group selection may have significant effects on the set of stable

population states.

Secondly, suppose that player i instead has the option to reward another player j.

Let vi and vj be the two players’ material payoffs if i does not reward j, and let the

payoffs be vi − c and vj + r if i rewards j, where c, r > 0. In terms of material payoffs,

it is sequentially rational for player i not to reward player j. However, rewarding is

sequentially rational in terms of derived payoffs if and only if

$$
\phi(v_i - c,\ v_j + r) \ge \phi(v_i, v_j). \tag{18}
$$

With bilinear fitness (10), this condition is equivalent to

$$
\left(\frac{r}{c} - 1\right)(v_i - c) \ge v_i + v_j. \tag{19}
$$

For c < vi, the latter condition requires r > c. In other words, a necessary (but not

sufficient) condition for rewarding to be sequentially rational is that the reward exceeds

the cost of giving it. Figure 8 below illustrates condition (19); rewarding is sequentially

rational for all pairs (r, c) on and above the steep curve, and not below it (the diagram

is drawn for vi = vj = 1). The thin straight line is r = c.
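The step from (18) to (19) can be spelled out as follows. This is a sketch under the assumption that the bilinear fitness function (10) is proportional to v_i (v_i + v_j); equation (10) itself is not restated in this section, so the form should be checked against it.

```latex
\begin{align*}
\phi(v_i - c,\, v_j + r) \ge \phi(v_i, v_j)
  &\iff (v_i - c)(v_i + v_j + r - c) \ge v_i (v_i + v_j) \\
  &\iff (v_i - c)(r - c) \ge c\,(v_i + v_j) \\
  &\iff \left(\frac{r}{c} - 1\right)(v_i - c) \ge v_i + v_j ,
\end{align*}
% The last step divides by c > 0.
% For v_i = v_j = 1 and 0 < c < 1 the condition reads
%   r \ge c\,(3 - c)/(1 - c).
```

Since the right-hand side of (19) is positive, the condition can hold with v_i − c > 0 only if r > c, which is the necessary condition noted above. For v_i = v_j = 1 the threshold c(3 − c)/(1 − c) equals 2.5 at c = 0.5, well above the line r = c.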

Figure 8: Parameter combinations (c, r) and the condition for rewarding others to be sequentially rational. (Axes: c horizontal, from 0 to 0.8; r vertical, from 0 to 8.)


5 Social preferences

The analysis so far has assumed that individuals do not choose a strategy; they are “pro-

grammed” to a pure strategy from birth. However, the long run of the present model

can be interpreted in terms of rational individual choice and expectations formation

as follows. Let the underlying material payoffs be given by the matrix Π and consider the derived payoff matrix Π̃ as a representation of rational individuals’ von Neumann-Morgenstern utilities. We here use the term “rational” in its usual economics sense (pioneered by Savage (1954)), that is, behavior that is consistent with the maximization

of the expected value of some goal function under some subjective probabilistic belief

about the state of the world (here: other players’ actions).10

It is well-known that in finite two-player games, a pure strategy is strictly dom-

inated if and only if it is not optimal against any probability distribution over the

other player’s strategy choice (Pearce, 1984). It is known that the replicator dynamic

asymptotically wipes out all iteratively strictly dominated pure strategies, from any in-

terior initial population state, in all finite games.11 Hence, observed play in the present

model, after the population has evolved over a long time span from an arbitrary interior

initial state, is consistent with the hypothesis that all individuals have von Neumann-

Morgenstern utilities (6), are rational, and that their preferences and rationality are

common knowledge in the population.12

Let us consider the example in section 3.1 in this light. Assume a = 0.8 and b = 2.2.

From Figure 1 we deduce that in terms of material payoffs, the game is a prisoners’

dilemma with strategy 1 being C. However, in terms of the derived payoffs strategy C

strictly dominates strategy D. The present selection dynamic thus takes the population

state from any interior state to the state in which everybody plays C. Hence, to an

outside observer who sees this long-run outcome, and who knows the material payoffs

Π, this observation is consistent with the hypothesis that all individuals are rational and

have von Neumann-Morgenstern utilities (6) as detailed in (10). Such preferences are

“social” or “other-regarding:” they depend both on own material payoff and on the other

10 von Neumann-Morgenstern utilities are real numbers u(ω) attached to outcomes ω in such a way that the decision maker’s choices among lotteries over outcomes are consistent with maximization of the expected value of u.

11 Akin (1980) showed that all strictly dominated strategies are wiped out in this sense in all finite and symmetric two-player games. This result was generalized to iteratively strictly dominated strategies in arbitrary finite games, for both the Taylor (1979) and Maynard Smith (1982) versions of the multi-population replicator dynamic, by Samuelson and Zhang (1992) and Hofbauer and Weibull (1996).

12 A player who knows other players’ preferences and that other players are rational does not use iteratively strictly dominated strategies.


player’s material payoff. Likewise, the long-run population state in the common-pool

game in section 3.3 is consistent with rational players who have “social” preferences,

granted the parameters satisfy condition (17).

It is known that if the replicator dynamic converges to some population state from

an interior initial state, then the limit state is a Nash equilibrium.13 Hence, observation

of play after the population has evolved for a long time along some interior convergent

solution trajectory is consistent with the hypothesis that all individuals have preferences

according to (6) and play (approximately) according to the limiting Nash equilibrium.

Consider the public-goods example in section 3.2 in this light, for, say, n = 4 and

λ = 0.7. If the population behavior settles down over time from some interior initial

state, and an observer sees the long-run behavior, then this observation is consistent

with the hypothesis that all individuals have von Neumann-Morgenstern utilities (??), and play (approximately) the game’s unique symmetric Nash equilibrium, namely to

contribute maximally to the public good. Again, it is as if individuals were rational,

and had “rational” expectations and “social” preferences.

The above observations substantiate the first part of the above claim, namely that

the present model can be viewed as a model of rational choice and, sometimes, Nash

equilibrium play, and they suggest that the preferences so revealed may be social in nature.

In order to substantiate this second claim further, we now examine more closely the

nature of the derived payoffs, bearing in mind that the fitness function φ in its turn may

depend on material production and reproduction conditions as well as forms of social

interaction, “institutions”, and habits, of the population under study. In particular, as

these conditions change, so may the induced social preferences.

Assume, first, that the fitness function φ is bilinear, as specified in equation (10).

The figure below shows a contour map of this function, hence, the indifference curves

of a player with such preferences, with own material payoff on the horizontal axis and

the other player’s material payoff on the vertical. The two straight lines have slope plus

and minus 45 degrees.

13 See Nachbar (1990) and Weibull (1995).


Figure 9: Indifference curves induced by the bi-linear fitness function. (Axes: own material payoff horizontal, other's material payoff vertical, both from 0 to 8.)

We see that the indifference curves are not the vertical ones of homo oeconomicus

– the selfish species studied in most of economics. Indeed, an individual with φ as

his or her utility function prefers the “fair” payoff allocation (3, 3) to one where he/she

gets 4 material payoff units and the other zero. In this sense, individuals have a certain

“preference for fairness.” However, as the negatively sloped “budget line” shows: if

there is a given material payoff sum to be divided (here 6 units), then each individual

prefers to get the whole “pie” for him- or herself. It is as if others’ material well-being

is of some concern, but less so than one’s own, in particular when others are better

off. We also note that along the positively sloped diagonal, where individual material

payoffs are equal, the indifference curves become steeper as the common material payoff

goes up. It is as if equally wealthy individuals care less about each others’ well-being –

are more similar to homo oeconomicus – than equally poor individuals do: a wealthy

person is “more upset” by a marginal transfer to an equally wealthy person than a poor

person is by a marginal transfer to an equally poor person.
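The two comparisons made above, the “fair” allocation against (4, 0) and the division of a fixed sum of 6 units, can be reproduced in a few lines. This Python sketch again assumes that the bilinear fitness function (10) is proportional to v_i (v_i + v_j); the variable names are illustrative.

```python
def phi(v_own, v_other):
    """Assumed bilinear fitness (10), up to a positive constant."""
    return v_own * (v_own + v_other)


fair = phi(3, 3)       # equal split: 3 units each
lopsided = phi(4, 0)   # 4 units for oneself, none for the other

# Utility along the "budget line" v_own + v_other = 6, for own shares 0, 1, ..., 6.
along_budget_line = [phi(x, 6 - x) for x in range(7)]
```

Under this assumed form, `fair` exceeds `lopsided`, so (3, 3) is preferred to (4, 0), while `along_budget_line` increases with own share, so the whole “pie” is preferred when the sum is fixed.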

What about other group fitness functions? A more general class of fitness functions

is given by

$$
\phi(v_i, v_{-i}) = v_i \cdot \psi(v), \tag{20}
$$


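for

$$
\psi(v) = \left[\frac{1}{n}\sum_{j=1}^{n}\left(\frac{v_j - \pi^-}{\pi^+ - \pi^-}\right)^{\rho}\right]^{1/\rho}, \tag{21}
$$

where ρ ∈ R and

$$
\pi^- \le \min_{h,k \in S} \pi_{hk} \quad \text{and} \quad \pi^+ \ge \max_{h,k \in S} \pi_{hk} \tag{22}
$$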

(we assume π− < π+). The function value ψ (v) may be thought of as the survival

probability of an offspring in a group with material payoff vector v = (v1, ..., vn). By

(21), this probability is a symmetric, continuous and strictly increasing function of

the payoff vector v. It belongs to a parametric function family called CES (constant

elasticity of substitution) functions in economics. As is shown in an appendix at the

end of the paper, the function ψ has the following properties:14

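$$
\psi(v) = \frac{1}{\pi^+ - \pi^-} \cdot
\begin{cases}
\dfrac{1}{n}\sum_{j=1}^{n} (v_j - \pi^-) & \text{when } \rho = 1 \\[1.5ex]
\left[\prod_{j=1}^{n} (v_j - \pi^-)\right]^{1/n} & \text{when } \rho = 0 \\[1.5ex]
\min_{1 \le j \le n} (v_j - \pi^-) & \text{when } \rho \to -\infty \\[1.5ex]
\max_{1 \le j \le n} (v_j - \pi^-) & \text{when } \rho \to +\infty
\end{cases}
. \tag{23}
$$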

In other words, ψ (v) is proportional to the arithmetic average payoff gain in the group

when ρ = 1, to the geometric average payoff gain when ρ = 0, and its limit as ρ → −∞ (ρ → +∞) is the minimal (maximal) payoff gain in the group. These special cases represent established social welfare functions in economics: ρ = 1 corresponding to

Bentham’s utilitarian welfare function, ρ = 0 to Nash’s bargaining-based welfare func-

tion, and ρ→ −∞ to Rawls’ egalitarian welfare function. Moreover, if conditions (22)

are met with equality, then ψ (v) is scale invariant in the sense of being unaffected by

positive affine transformations of material payoffs.15 As a consequence, the solution

orbits to our selection dynamic (4) are then invariant under positive affine transforma-

tions of material payoffs, a form of scale invariance shared with dominance relations

and Nash equilibrium.16
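The special cases in (23) and the scale invariance just described can be checked numerically. The following Python sketch implements (21) directly, using the geometric-mean expression at ρ = 0 as in footnote 14. The function name and the numerical values are illustrative assumptions, and the payoff gains must be strictly positive for ρ ≤ 0.

```python
import math


def psi(v, rho, pi_minus, pi_plus):
    """CES survival probability (21); rho = 0 uses the geometric-mean limit."""
    gains = [(vj - pi_minus) / (pi_plus - pi_minus) for vj in v]
    if rho == 0:
        return math.exp(sum(math.log(g) for g in gains) / len(gains))
    return (sum(g ** rho for g in gains) / len(gains)) ** (1 / rho)


v = [1.0, 3.0]              # hypothetical material payoffs in a group of two
lo, hi = 0.0, 4.0           # hypothetical bounds pi_minus and pi_plus

arithmetic = psi(v, 1, lo, hi)       # average normalized gain
geometric = psi(v, 0, lo, hi)        # geometric mean of normalized gains
near_min = psi(v, -200, lo, hi)      # approaches the smallest normalized gain
near_max = psi(v, 200, lo, hi)       # approaches the largest normalized gain

# Positive affine transformation v' = a + b v applied to payoffs and bounds alike.
a, b = 5.0, 2.0
shifted = psi([a + b * vj for vj in v], 0.5, a + b * lo, a + b * hi)
```

Here the normalized gains are 0.25 and 0.75, so `arithmetic` is 0.5, while `near_min` and `near_max` lie close to 0.25 and 0.75 respectively. The value `shifted` coincides with `psi(v, 0.5, lo, hi)` up to rounding, because the normalized gains are unchanged when the bounds are transformed together with the payoffs.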

Isoquants for the fitness function φ through the point vi = v−i = 3 are shown in

Figure 10 below, for n = 2, π− = 0, and different values of ρ. The horizontal axis

14 The right-hand side in (21) is undefined for ρ = 0, but the given expression is the limit as ρ → 0. Hence, this expression is the continuous extension of the definition for ρ ≠ 0 to ρ = 0.

15 More exactly, if each argument vi is replaced by v′i = a + bvi for some a, b ∈ R with b positive, then ψ(v′) = ψ(v).

16 Alternatively, background fitness factors could be incorporated by means of shifting π− downwards by the corresponding amount in material payoffs.


represents own material payoff, the vertical axis the other player’s material payoff, and

the curves are ordered from left to right according to the ρ-values. The left-most curve

is the indifference curve when ρ → +∞ and the right-most curve when ρ → −∞. The bold-face curve in the middle is the indifference curve when ρ = 1. The two

straight lines are v−i = vi and v−i = 6 − vi, respectively. We note, in particular, that

in the limit case as ρ → −∞, individuals have the same selfish preferences as homo oeconomicus whenever the other individual in the group earns a higher material payoff

(the indifference curve is vertical above the diagonal) but prefer the even split over all

other allocations of a given material payoff sum.
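Both properties of the limit case ρ → −∞ can be verified with a short computation. By (20) and (23), for n = 2 and π− = 0 the limiting fitness is proportional to v_i · min(v_i, v_−i). In the Python sketch below, the normalizing constant `pi_plus = 8.0` is a hypothetical value that only rescales the function, and the names are illustrative.

```python
def phi_rawls(v_own, v_other, pi_plus=8.0):
    """Limit of (20)-(21) as rho -> -infinity, for n = 2 and pi_minus = 0."""
    return v_own * min(v_own, v_other) / pi_plus


# Above the diagonal (other earns more) utility does not depend on the other's payoff.
same_utility = phi_rawls(2, 3) == phi_rawls(2, 7)

# Utility from own shares 0, 1, ..., 6 of a fixed payoff sum of 6.
splits = [phi_rawls(x, 6 - x) for x in range(7)]
best_own_share = max(range(7), key=lambda x: splits[x])
```

In this example `same_utility` is `True`, which corresponds to the vertical indifference curve above the diagonal, and `best_own_share` is 3, the even split of the 6 units.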

Figure 10: Isoquants of (20), when parametrized as in (21), for different values of ρ. (Axes: own material payoff horizontal, other's material payoff vertical.)

Preferences represented by (20) and (21) are qualitatively similar to those in Fehr

and Schmidt (1999) and Charness and Rabin (2002). Fehr and Schmidt assume that

an individual’s utility is composed of three terms, where one is own material payoff

and the two others represent “fairness” with regard to those who are better and worse

off, respectively. Charness and Rabin (2002) suggest that much experimental evidence

for human subjects is consistent with individual utility being the sum of own material

payoff and a social welfare function for all participants. The only essential difference

in comparison with the present functions φ is that the latter are multiplicative, not

additive, and, unlike Fehr and Schmidt, cannot express spitefulness – a wish to reduce


others’ material payoffs.17

6 Related literature

As mentioned above, the model in Maynard Smith (1964) is adiabatic, with individual

selection working on a faster time scale than group selection. Behaviors within each

group thus first converge to a steady state determined by individual selection, before

selection among groups takes place. By contrast, Kerr and Godfrey-Smith (2002) allow

individual and group selection to operate on the same time scale, and discuss individu-

alist and multi-level perspectives on natural selection. In their vocabulary, the present

model is an example of the contextual approach, where individuals are the bearers of

fitness and fitness is sensitive to the context of the individual. They contrast this ap-

proach with what they call the collective approach, where instead collectives (groups)

are fitness-bearing entities in their own right. Their model is not game-theoretic, how-

ever, and their analysis is only remotely related to ours.

Another strand of the literature is the so-called indirect evolutionary approach, pio-

neered by Güth and Yaari (1992).18 In that approach, individuals are randomly drawn

from large populations to play a game defined in terms of material payoffs, just as here.

However, individuals have different preferences, and, by assumption, play some equi-

librium either of the game defined in terms of the drawn individuals’ preferences or in

the game defined by the population distribution of preferences. The drawn individuals

receive material payoffs according to the underlying game. Preferences that result in

higher material payoffs in the corresponding equilibria are selected for. The indirect

evolutionary approach is thus quite distinct from the present. Closest to our approach

among those models is that in Herold (2004), who studies preferences for rewarding altruistic behaviors and punishing hostile or selfish behaviors (cf. Section 3.3 above). He

shows that such preferences can survive in the indirect evolutionary approach with ran-

domly formed groups, even when individual preferences are unobservable. His results

do rely, however, on the assumption that individuals can condition on the preference

distribution in their own group, an assumption not made in the present approach.

17 Spiteful social preferences could arise under group selection as modelled here if an increase in others’ material payoffs (for example when these are above one’s own) would reduce the survival probability of own offspring. The symmetry of the survival function ψ would then have to be replaced by symmetry with respect to other group members’ material payoffs.

18 See also Bester and Güth (1998), Huck and Oechssler (1999), Sethi and Somanathan (2001), Güth and Peleg (2001) and Herold (2004).


Kuzmics (2003) develops a model of simultaneous individual and group selection

for symmetric two-player games. Like here, he considers the mean-field equation in a

large population. However, unlike here, the number of groups is finite, so each group is

large when the total population is made large in the mean-field approximation. Groups

may thus be thought of as subpopulations. Individual selection operates within each

subpopulation, but group selection is driven by migration. More exactly, the model op-

erates in continuous time by way of a Poisson arrival process of migration-cum-imitation

opportunities for each individual. When an individual gets such an opportunity, she

migrates to another group with a probability that is increasing in that group’s current

average material payoff. Whether or not she migrated, she then imitates a randomly

drawn individual in her chosen subpopulation, with a higher probability to imitate

an individual with a higher current material payoff. When applied to CO-games, the

long-run prediction is that all individuals play the pure strategy that yields the highest

material payoff. This contrasts with the predictions of the current model, where other

outcomes are possible. The reason for the tendency towards joint payoff maximization

in Kuzmics’ model is migration, which, by assumption, is directed towards subpopula-

tions with high average payoffs. When applied to PD-games, the long-run prediction

diverges. Individual selection in favor of strategy D is counter-acted by migration to

groups where many play C, resulting in perpetual fluctuations in the population state.

Again, the prediction differs from that of our model.

In Vega-Redondo (1996), a finite population of individuals is recurrently and randomly matched in pairs to play a prisoners’ dilemma. Time is divided into an infinite sequence of discrete periods, and in each such period, every individual is matched

with another individual. Each matched pair forms a group, and there is both indi-

vidual and group selection at the end of each time period. First, individual selection

takes place: each individual switches to the strategy that yielded the highest payoff

in its group. Thereafter, a mutation may take place: with a very small probability

the selected strategy is replaced by the other pure strategy. Third, group selection

takes place: each group is disbanded with a positive probability, and the members of

disbanded groups switch to the strategies used in those groups that earned the high-

est payoff sum. This defines an ergodic population process, and Vega-Redondo (1996)

shows that if the mutation rate is very small and the population large, its invariant

distribution places virtually all probability mass on the population state in which all

individuals cooperate. This result contrasts with ours, where this is the long-run out-

come only for certain parameter values (see section 3.1). The reason for Vega-Redondo’s


drastic result seems to be that if the population initially is in the pure-C state, and, say,

a pair of individuals mutates to (D,D), then group selection will bring the pair back to

(C,C) as soon as it disbands, and this happens with a probability that is “infinitely”

higher than a mutation. Hence, while Maynard Smith’s haystack model can be said to

be tilted in favor of individual selection by letting this operate at a faster time scale

than group selection, Vega-Redondo’s model can be said to be tilted in favor of group

selection by giving group selection full swing as soon as a single group has mutated to

a “better” strategy profile.

The model in Sjöström and Weitzman (1996), finally, is not game-theoretic but

deals with the same issues as here, in the context of the internal efficiency of competing

firms. There are infinitely many firms and each firm has the same finite number of

workers. Within a firm, each worker is “programmed” to some effort level. All workers

in each firm are paid a wage that depends on their average effort. The resulting utility or

fitness to a worker is the wage minus a “cost” or “disutility” of exerting effort. Workers

also have an outside option (for instance, self employment) that gives them a fixed

utility/fitness. The (owner of the) firm sets the wage so as to keep the workers indifferent

between staying and leaving. Suppose that initially all workers in a firm make the same

effort, and that one worker suddenly mutates to a lower effort level. This worker’s

individual utility/fitness goes up, and by individual selection this becomes the worker’s

new effort level. All other workers in the firm imitate the mutant, so the firm ends up

with all workers exerting the lower effort. Hence, individual mutation-cum-selection

drives efforts down towards zero within each firm. However, this force is counteracted

by a form of firm selection. Now and then a pair of firms is randomly drawn, and

the effort level in the firm with higher profit is “copied” over to the other firm. It is,

thus, as if the more profitable firm takes over the niche of the other firm. Sjöström and

Weitzman (1996) show that the ratio between the individual mutation-cum-selection

rate and the firm selection rate is crucial for the long-run outcome. An increase in

this ratio unambiguously increases efficiency (in the sense of stochastic dominance).

At a first glance, this model may seem similar to the present prisoners’ dilemma and

public-goods games. However, all workers are equally well off in all population states,

while this is not true for players in prisoners’ dilemma games and public-goods provision

games. The key difference is that the firms in Sjöström and Weitzman (1996), when

viewed as groups, are asymmetric, where one group member absorbs all excess payoff.


7 Concluding remarks

The aim of this study was to construct a parsimonious population selection model that

allows for both individual and group selection, without the adiabatic feature of the

haystack model.

Some earlier models have modelled group selection as a pairwise contest between

groups; out of two randomly selected groups, the one with highest group fitness “wins,”

in terms of future population shares. Instead, we have a very large number of groups

who all “compete” with each other, where groups with higher group fitness “win,” in

terms of future population shares, over those with lower group fitness. In this respect,

our approach is similar in spirit to how perfect competition is modelled in economics;

where no individual firm can affect the market conditions of any other firm, while

aggregate firm (and consumer) behavior determines the market conditions of all firms.

We hope to have shown that the analytical power that follows from this approach allows

for straightforward analyses of relevance to evolutionary, experimental and behavioral

game theory.

Our approach calls for many extensions, including selection among multi-level hi-

erarchies. It also calls for a careful analysis of the full stochastic process that arises

when the population is large but finite. Another aspect that deserves more attention

is the fitness function φ, which here is treated as a primitive. This function is a “re-

duced form” representation of group organization and its environment, where group

organization in turn has a technological and a cultural or habitual side, representing

different forms of collective local organization in different natural environments. One

could thus conceive of selection of such group organization forms in different natural

environments, thus rendering the function φ endogenous. However, these topics bring

us outside the scope of the present, more limited, study.

8 Appendix

We prove the claims in the special case n = 2. Let

$$
f(x_1, x_2, \rho) = \left[\theta x_1^{\rho} + (1 - \theta)\, x_2^{\rho}\right]^{1/\rho},
$$


for ρ ≤ 1, x1, x2 > 0 and θ ∈ (0, 1). Applying the Taylor expansion twice for ρ close to zero, we obtain

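$$
\begin{aligned}
\ln f(x_1, x_2, \rho)
&= \ln x_1 + \frac{1}{\rho}\ln\left[\theta + (1-\theta)\left(\frac{x_2}{x_1}\right)^{\rho}\right] \\
&= \ln x_1 + \frac{1}{\rho}\ln\left[\theta + (1-\theta)\exp\left(\rho \ln\frac{x_2}{x_1}\right)\right] \\
&= \ln x_1 + \frac{1}{\rho}\ln\left[\theta + (1-\theta)\left(1 + \rho \ln\frac{x_2}{x_1} + O(\rho^2)\right)\right] \\
&= \ln x_1 + \frac{1}{\rho}\ln\left[1 + (1-\theta)\,\rho \ln\frac{x_2}{x_1} + O(\rho^2)\right] \\
&= \ln x_1 + (1-\theta)\ln\frac{x_2}{x_1} + O(\rho) = \ln\left(x_1^{\theta} x_2^{1-\theta}\right) + O(\rho),
\end{aligned}
$$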

proving the claim for ρ = 0. Assume x1 < x2. The claim for ρ → −∞ follows

immediately from

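$$
\lim_{\rho \to -\infty} \ln f(x_1, x_2, \rho)
= \lim_{\rho \to -\infty} \left( \ln x_1 + \frac{1}{\rho} \ln\left[\theta + (1-\theta)\exp\left(\rho \ln\frac{x_2}{x_1}\right)\right] \right)
= \ln x_1 .
$$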

Finally, assume x1 > x2. Then

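$$
\lim_{\rho \to +\infty} \ln f(x_1, x_2, \rho)
= \ln x_1 + \lim_{\rho \to +\infty} \frac{1}{\rho} \ln\left[\theta + (1-\theta)\exp\left(\rho \ln\frac{x_2}{x_1}\right)\right]
= \ln x_1 .
$$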
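The three limits established in this appendix can also be checked numerically. The following Python sketch is only an illustration and not part of the proof: the function name `log_ces` and the values chosen for θ, x1 and x2 are our own assumptions, and ln f is evaluated in log space to avoid overflow when |ρ| is large. The last check evaluates the formula at a large positive ρ, as the final limit above does.

```python
import math


def log_ces(x1: float, x2: float, rho: float, theta: float) -> float:
    """Return ln f(x1, x2, rho) for rho != 0, evaluated in log space.

    f is the weighted power mean [theta*x1**rho + (1-theta)*x2**rho]**(1/rho).
    Working with logarithms avoids overflow when |rho| is large.
    """
    a = math.log(theta) + rho * math.log(x1)
    b = math.log(1.0 - theta) + rho * math.log(x2)
    m = max(a, b)
    return (m + math.log(math.exp(a - m) + math.exp(b - m))) / rho


# Illustrative values only (assumptions, not taken from the paper).
theta = 0.3
x_small, x_large = 2.0, 5.0

# rho -> 0: ln f approaches the log of the geometric mean x1^theta * x2^(1-theta).
near_zero = log_ces(x_small, x_large, 1e-6, theta)
geometric = theta * math.log(x_small) + (1.0 - theta) * math.log(x_large)

# rho -> -infinity with x1 < x2: ln f approaches ln x1, the smaller argument.
very_negative = log_ces(x_small, x_large, -1e4, theta)

# rho -> +infinity with x1 > x2: ln f approaches ln x1, the larger argument.
very_positive = log_ces(x_large, x_small, 1e4, theta)

# Each printed gap should be close to zero.
print(near_zero - geometric)
print(very_negative - math.log(x_small))
print(very_positive - math.log(x_large))
```

The gap in the first check shrinks linearly in ρ, in line with the O(ρ) remainder in the expansion, while the gaps in the other two checks shrink like 1/|ρ|, since the bracketed term converges to θ.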
