DESIGN AND ANALYSIS OF LOYALTY REWARD PROGRAMS

A DISSERTATION

SUBMITTED TO THE DEPARTMENT OF MANAGEMENT SCIENCE &

ENGINEERING

AND THE COMMITTEE ON GRADUATE STUDIES

OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

Arpit Amar Goel

March 2017

http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/jv609vj6030

© 2017 by Arpit Amar Goel. All Rights Reserved.

Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Ashish Goel, Primary Adviser


Dan Iancu


Ramesh Johari

Approved for the Stanford University Committee on Graduate Studies.

Patricia J. Gumport, Vice Provost for Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

Summary

This thesis provides an in-depth analysis of two major components in the design of loyalty reward

programs. First, we discuss the design of coalition loyalty programs - schemes where customers

can earn and spend reward points across multiple merchant partners. And second, we conduct a

model based comparison of a standalone loyalty reward program against traditional pricing - we

theoretically characterize the conditions under which it is better to run a reward program within a

competitive environment.

Coalition loyalty programs are agreements between merchants allowing their customers to exchange

reward points from one merchant to another at agreed upon exchange rates. Such exchanges

lead to transfer of liabilities between merchant partners, which need to be frequently settled using

payments. We first conduct an empirical investigation of existing coalitions, and formulate an analytical

model of bargaining for merchant partners to agree upon the exchange rate and payment

parameters. We show that our bargaining model produces networks that are close to optimal in

terms of social welfare, in addition to cohering with empirical observations. Then, we introduce

a novel alternate methodology for settling the transferred liabilities between merchants participating

in a coalition. Our model has three interesting properties – it is decentralized, arbitrage-proof,

and fair against market power concentration – which make it a real alternative to how settlements

happen in coalition loyalty programs.

Finally, we investigate the design of an optimal reward program for a merchant competing against

a traditional pricing merchant, for varying customer populations, where customers measure their

utility in rational economic terms. We assume customers are either myopic or strategic, and have

a prior loyalty bias toward the reward program merchant, drawn from a known distribution. We

show that for the reward program to perform better, it is necessary for a minimum fraction of the

customer population to be strategic, and the loyalty bias distribution to be within an optimal range.

This thesis is a useful read for marketers building promotional schemes within retail, researchers

in the field of marketing and behavioral science, and companies investigating the intersection of

customer behavior, loyalty, and virtual currencies.

Acknowledgments

My PhD is a culmination of five years of professional and personal development which would not

have been possible without the support from numerous people who I would like to extend a very

special thanks to.

First, I thank my advisor, Ashish Goel, for his continuous guidance and support. Ashish has

a very strong ability to identify problems that have long term impact. Back in 2012, within our

first few interactions, some of which included key industry leaders, we were able to formulate an

abstract research problem that almost served as my dissertation thesis. Consequently, our research

proposal was not only theoretically relevant, but practical to the industry. In addition, Ashish has

been a really patient and encouraging advisor. The course of PhD is full of ups and downs, and

he made sure to find interesting collaborations for me during the low times, which kept me excited

toward the bigger picture. And most importantly, I learnt from him key interpersonal skills like

clear articulation, communication, and disciplined commitment toward finishing goals.

I thank Ramesh Johari and Dan Iancu for serving on my reading committee and taking out the

time to read through my dissertation to offer valuable comments; and Professor Warren Hausman

and Itai Ashlagi for serving on my oral defense committee. Teaching is indeed the best way to learn

a subject, and I gained tremendous knowledge across algorithms, optimization, and data science by

offering my services as a teaching assistant to different classes. I would like to thank professors who

provided me with invaluable teaching experience in the past years: Ramesh Johari, Yinyu Ye, Ashish

Goel, and Tim Roughgarden. And special thanks to Professor Yinyu Ye and Tim Roughgarden for

introducing me to research in my very first year at Stanford. I entered Stanford as a Masters student

back in 2011, and I could not have transitioned into PhD research without the guidance and support

from the two of them.

Stanford is indeed a magical place when it comes to the quality of education along with the

diversity of thoughts. I was fortunate to make use of this diversity to a great extent. I thank Matt

Jackson for his course on Social and Network Economics; Serge Plotkin for his Algorithms series

courses which I thoroughly enjoyed; Ben Van Roy for his course series on Dynamic Programming;

Yinyu Ye for his course series on Optimization; and Peter Glynn for Stochastic Processes. All

these classes provided insightful information for my research. In addition, I thank Ann Grimes and

Stanford Venture Studio for introducing me to the entrepreneurial ecosystem at Stanford; Andrew

Ng, Chris Manning, and Jure Leskovec for their classes on machine learning, natural language

processing, and data mining. And a very big thanks to my music teachers at Stanford - Timothy

Zerlang for teaching me Piano and Claire Giovannetti for teaching me singing.

My Stanford experience is truly incomplete without the mention of the numerous collaborations,

relationships, and friendships I developed, some of which I hope will last throughout my life. I

thank Ali Dasdan for hosting me at Turn Inc. for the summer in 2012; Gloria Lau and Craig

Martell for hosting me at LinkedIn Corp. for the summer in 2014; Sal Uryasev for being an awesome

mentor for my summer internship at LinkedIn; and postdocs Pranav Dandekar and Sid Banerjee for

helping me learn research fundamentals during my early days. I thank the amazing cohort of my lab

friends - Shayan, Peter, Hong, Hongsek, Camelia, Vijay, Nikhil, Carlos, Nolan - with whom I not

only shared the frustrations during research, but also had many fun conversations around politics,

science, philosophy, history, and religion. And a very big thanks to some of my really close friends -

Bharath, Subodh, Anshul, Raghu, Bobo, Nipun, Sparsh, Aju, Rose, Navneet, Apaar for offering me

support like a family when I needed it the most. Especially Bharath and Subodh for the few amazing

“startup” projects we worked on together, and will continue to work on. A very special thanks to

my best friend Purvi for really taking out the energy to go through every aspect of my thesis and

calm me down to simplify this enormous journey toward the end.

Finally, and most importantly, I thank my family. My parents taught me the principles of

commitment and discipline for hard work. My nanny (I call her pappy) played an important role

in my upbringing and I can’t thank her enough for teaching me the simplicity of life. My elder

siblings, Ambika and Shakti, have always guided me throughout my schooling and education. I

owe most of my knowledge to the two of them. Ambika, and my brother in law Anurag, have been

very supportive during the five years of my PhD, and have frequently visited me in times of need.

The past few years have been challenging not just with the PhD research, but because we all went

through a lot as a family. Our ability to stand tall together and support each other during such

times has led me to this milestone.

Contents

Summary

Acknowledgments

1 Introduction

2 Network Formation of Coalition Loyalty Programs

2.1 Bilateral Negotiation Model

2.2 Empirical Investigation

2.3 Model Extensions

2.4 Conclusions

2.5 Appendix

2.5.1 Analysis of Social Welfare Gap Example

2.5.2 Proof of Proposition 1

2.5.3 Proof of Theorem 2.3.1

3 Credit Settlement in Coalition Loyalty Programs

3.1 Model

3.2 Results

3.3 Conclusions

3.4 Appendix

3.4.1 Proof of Theorem 3.2.2

4 Optimal Design of a Frequency Reward Program

4.1 Model

4.1.1 Customer Behavior Model

4.1.2 Merchant Objective

4.2 Results

4.2.1 Customer Choice Dynamics

4.2.2 Merchant Objective Dynamics

4.3 Conclusions

4.4 Appendix

4.4.1 Proof of Lemma 9

4.4.2 Proof of Lemma 10

5 Future Directions

List of Figures

1.1 Some Examples of Coalition Loyalty Programs.

2.1 Extended Partners of Star Alliance Members

2.2 Exchange Rates within the Star Alliance Program

2.3 Partnerships Across Multiple Merchants

3.1 Illustrative Example of Dynamic IOU Settlement

3.2 Transaction from u to v Leading to Cycle C

4.1 Rate of revenue for reward merchant as a function of α (with k = eα(1��)) for different distributions. For all distributions, � = 0.9, p = 0.9 and v varies as labeled. The uniform distribution is on (0, b] with b = 0.9; the normal distribution has µ = 0.5 and σ = 0.1; and the logit-normal distribution is the standard on [0, 1].

4.2 Regions where RoRA > RoRB (blue), where RoRA > b/2 (yellow) and where both are true (green) for different values of α. In all cases, � = 0.95, v = 0.05 and � drawn uniformly on (0, b].

4.3 The upper and lower bounds on b as a function of p. Here v = 0.05 and α → e.

4.4 Bounds on b for various values of p and v at α → e. Top shows lower bounds on b for RoRA ≥ RoRB and bottom shows upper bounds of b for RoRA ≥ b/2.

Chapter 1

Introduction

Loyalty reward programs constitute a huge market in consumer retail, primarily serving as a mechanism

for customer acquisition and retention (Sharp and Sharp [1997], Taylor and Neslin [2005]).

Over 48 billion dollars in perceived value of rewards is issued in the United States alone every year,

with every household holding over 19 loyalty memberships on average (Berry [2013]). This market

includes global players such as credit card, hotel, and airline reward programs, as well as local

merchants such as restaurants, grocery stores, and retail stores.

There are many different forms of loyalty reward programs. One common form is the frequency

reward program, such as an airline frequent flyer program, wherein customers earn a certain number of

points from every purchase. These points can be subsequently redeemed for rewards. Other forms of

loyalty reward programs include punch cards, multi-tier rewards, and cashback rewards. In a punch

card reward program, a customer can claim a reward after making some fixed number of purchases

from the merchant – for instance, redeeming one free car wash after paying for ten car washes from the

same merchant. In a multi-tier reward program, the perceived value of rewards increases non-linearly

as a customer’s reward point status with the merchant increases from bronze to silver to gold –

this non-linearity in rewards is what incentivizes members of a program to become more loyal to

the merchant¹. Cashback reward programs allow customers to redeem a fixed percentage of the

total spending they make with the merchant as cashback, subject to some minimum

expenditure constraints.

Over time, many stand-alone loyalty programs have agglomerated into larger coalition programs

which allow customers to earn and redeem points across different merchant partners participating

in the coalition. The observed coalition networks are surprisingly complex, encompassing both

pairwise partnerships like the Safeway-Chevron coalition (cf. Fig. 1.1a) as well as centralized coalition

loyalty programs such as Star Alliance (cf. Fig. 1.1b) and OneWorld Alliance (international airline

alliances), Nectar (U.K.), Air Miles (Canada), Payback (Germany), Fly Buys (Australia), etc. Often

¹ For example, every major airline offers tiered rewards with a premium status on attaining sufficient miles.

Figure 1.1: Some Examples of Coalition Loyalty Programs. (a) Safeway-Chevron Partnership; (b) Airline Members of Star Alliance; (c) Partners of United Airlines.

a merchant is part of multiple such networks leading to coalitions which are complex combinations

of centralized as well as pairwise partnerships. For instance, United Airlines, in addition to being

part of the Star Alliance, has pairwise partnerships with local airlines and merchants (cf. Fig. 1.1c).

These coalition loyalty programs are agreements between merchants allowing their customers

to exchange reward points from one merchant to another at agreed upon exchange rates. For

instance – customers carrying points from Marriott Hotel’s reward program can convert them into

miles from United Airlines at an exchange rate of one to four, i.e., 10,000 Marriott points convert

into 2,500 United miles. Such an exchange, though it appears unrewarding for the customer,

is often beneficial – the customer might be carrying currency from United Airlines and the above

mentioned exchange might enable an immediate redemption of a free flight from United Airlines.

Usually, a high threshold of points is required to redeem worthwhile rewards from merchants, and

thereby an exchange as mentioned above incentivizes a customer to convert currencies earned at

merchants (s)he visits infrequently into currencies of merchants (s)he visits more frequently. That

is, customers often choose one merchant to accumulate currency with, and convert points they earn

at other merchants into their preferred merchant’s currency. Thus, the frequency of these currency

exchanges depends on which merchants are chosen by customers for currency accumulation, how

frequently customers visit different merchants, and what exchange rates are agreed upon

between pairs of merchants. An interesting thing to note here is that even if two merchants have

no direct partnership, a customer could still convert points between them via a path – for instance

Hawaiian Airlines is not part of Star Alliance but has a partnership with United Airlines which

is a part of Star Alliance (as shown in Fig. 1.1b and Fig. 1.1c). Hence a customer can effectively

convert miles earned on Hawaiian Airlines into miles with any Star Alliance member like Lufthansa

Airlines. A great variety is observed in the exchange rates between different merchant partners

and such exchanges of points lead to transfer of liabilities which need to be frequently settled.

Setting up these programs requires negotiations between different participating business entities.

Though coalition loyalty programs are very popular and well studied in the literature (Capizzi and

Ferguson [2005], Cao et al. [2015], Lazzarini [2007]), there is little formal understanding of the


structure and strategic formation of such networks, specifically the exchange rates between different

merchants. In Chapter 2, we introduce a model to understand the inter-partner utilities that arise in

coalition loyalty programs and study strategic network formation of such coalitions. We use pairwise

bargaining between merchant pairs as a tool for negotiating partnerships over the network and show

that this process is efficient – the network structures obtained theoretically cohere with empirically

observed data, and the social welfare obtained is close to optimum for practically relevant scenarios.
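The conversion arithmetic described in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the function name `convert_along_path` is hypothetical, and the Hawaiian-to-United and United-to-Lufthansa rates are made-up placeholders rather than observed exchange rates. Only the Marriott-to-United rate of one to four comes from the example in the text.

```python
from math import prod


def convert_along_path(points, rates):
    """Convert `points` through a chain of exchanges.

    Each entry of `rates` is the number of target-currency points received
    per source-currency point on one hop, so the overall rate of a
    multi-hop conversion is the product of the per-hop rates.
    """
    return points * prod(rates)


# From the text: 10,000 Marriott points convert into 2,500 United miles.
united_miles = convert_along_path(10_000, [1 / 4])

# Hypothetical two-hop conversion Hawaiian -> United -> Lufthansa.
# Both rates below are illustrative assumptions, not real exchange rates.
hawaiian_to_united = 0.5
united_to_lufthansa = 0.8
lufthansa_miles = convert_along_path(1_000, [hawaiian_to_united, united_to_lufthansa])

print(united_miles, lufthansa_miles)
```

Because the per-hop rates multiply, a customer converting over an indirect path typically loses value on every hop, which is why the choice of path matters once several routes exist.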

As mentioned above, the currency exchanges initiated by customers within coalition loyalty

programs introduce liabilities between merchants which frequently need to be settled. Merchants

settle these liabilities as follows: they set a mile to dollar value and keep track of the number of

miles owed within a fixed time frame, which is usually one fiscal year. They then settle accounts by

paying each other the respective mile to dollar value times the number of miles owed. This settlement

process is inefficient and risky. First, it requires a centralized currency like U.S. dollars. Second, as

we discuss in the next chapter, the negotiation of the payment parameters is a complicated process.

And third, merchants often change the speed at which they roll out currency to their customers –

for instance, a customer might receive 50,000 bonus miles from United Airlines sometime during the

year for being a “valuable” customer. This could lead to inflation of that merchant’s currency. But

this change is not reflected in the already decided exchange rate and the mile to dollar value for

settling transferred liabilities, and could thereby lead to tensions between the partnering merchants.
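As a rough illustration of the periodic settlement described above, the sketch below multiplies the miles owed in each direction by a negotiated mile-to-dollar value. The function names, merchant pairing, and all numbers are hypothetical, and netting the two gross payments against each other is an assumption made here for convenience; the text only states that merchants pay each other the respective amounts.

```python
def gross_payments(miles_owed, dollars_per_mile):
    """Dollar payment owed by each debtor to each creditor.

    miles_owed[(debtor, creditor)]: miles the debtor owes over the period.
    dollars_per_mile[(debtor, creditor)]: negotiated mile-to-dollar value
    applied to that liability.
    """
    return {
        pair: miles * dollars_per_mile[pair]
        for pair, miles in miles_owed.items()
    }


def net_payment(payments, a, b):
    """Net dollars that `a` pays `b` (a negative value means `b` pays `a`)."""
    return payments.get((a, b), 0.0) - payments.get((b, a), 0.0)


# Hypothetical year-end ledger; all numbers are made up for illustration.
miles_owed = {("United", "Marriott"): 40_000, ("Marriott", "United"): 100_000}
dollars_per_mile = {("United", "Marriott"): 0.01, ("Marriott", "United"): 0.003}

payments = gross_payments(miles_owed, dollars_per_mile)
united_net = net_payment(payments, "United", "Marriott")
print(payments, united_net)
```

Since `dollars_per_mile` is fixed when the contract is negotiated, a mid-year bonus issuance changes `miles_owed` but not the rate applied to it, which is the mismatch the paragraph above points to.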

In Chapter 3, we propose an alternate framework for dynamic settlement of transferred liabilities

in a coalition loyalty program: merchants commit to accepting each other’s reward points up to a

limit at prespecified exchange rates. We refer to these commitments as IOUs². Customers exchange

reward points between merchant pairs along paths of sufficient IOUs. We refer to such exchanges of

points as transactions. Transactions are routed via paths of maximum exchange rate to minimize

conversion loss. Through trusted intermediaries, they may also occur between merchant pairs who do not directly

commit to accepting each other’s currency. Past transactions are

accounted for by introducing reverse IOUs which are effectively promises to allow future transactions

to settle already transferred liabilities. Since the credit limits imposed are in respective currencies

of participating merchants, no central currency like U.S. dollars is required for settlements. Mutual

credit limits between two merchants impose the following restriction: if the flow of points between

two merchants does not balance out and replenish the credits, the system stops allowing conversion

of points between them along the direction of depleted credit, causing losses to the merchant who is

not able to reverse the transferred liabilities. We also show two further properties of the

system. First, transactions never lead to any new arbitrage opportunities in the system, though they

could create new IOUs. Second, the state of the system is independent of the paths chosen

for transactions as long as the paths maximize the exchange rate; i.e., nodes are not incentivized to

demand payments to act as an intermediary for transactions. In short, we introduce a decentralized

² IOU stands for “I Owe You”. Definition from Investopedia: An informal document that acknowledges a debt owed, and this debt does not necessarily involve a monetary value as it can also involve physical products.


model for settling transactions between merchants participating in a coalition loyalty program, and

show properties of the system which make it a viable alternative to the existing credit settlement process.
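The bookkeeping behind IOUs and reverse IOUs can be illustrated with a toy ledger for a single direct merchant pair. This is a sketch under stated assumptions, not the model of Chapter 3: the class `IOULedger`, its method names, the choice to measure each credit limit in the issuer's points, and the use of the inverse exchange rate on the reverse IOU are all assumptions introduced here for illustration. Routing over multi-hop paths is omitted.

```python
class IOULedger:
    """Toy ledger of IOUs between merchant pairs (illustrative assumptions only)."""

    def __init__(self):
        # credit[(acceptor, issuer)]: issuer-points the acceptor will still accept.
        self.credit = {}
        # rate[(acceptor, issuer)]: acceptor-points granted per issuer-point.
        self.rate = {}

    def commit(self, acceptor, issuer, limit, rate):
        """Acceptor commits to taking up to `limit` issuer-points at `rate`."""
        self.credit[(acceptor, issuer)] = limit
        self.rate[(acceptor, issuer)] = rate

    def convert(self, issuer, acceptor, points):
        """Convert `points` issuer-points into acceptor-points over a direct IOU."""
        key = (acceptor, issuer)
        if self.credit.get(key, 0.0) < points:
            raise ValueError("insufficient IOU credit in this direction")
        received = points * self.rate[key]
        self.credit[key] -= points
        # Assumed reverse IOU: the issuer now accepts `received` acceptor-points
        # back at the inverse rate, which would undo the transferred liability.
        reverse = (issuer, acceptor)
        self.credit[reverse] = self.credit.get(reverse, 0.0) + received
        self.rate.setdefault(reverse, 1.0 / self.rate[key])
        return received


ledger = IOULedger()
ledger.commit("United", "Marriott", limit=10_000, rate=0.25)
miles = ledger.convert("Marriott", "United", 4_000)        # uses up part of the limit
points_back = ledger.convert("United", "Marriott", 1_000)  # replenishes the limit
print(miles, points_back, ledger.credit)
```

In this toy version, flow in the opposite direction restores the depleted credit, while a one-sided flow eventually exhausts the limit and further conversions in that direction are refused, mirroring the restriction described above.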

Finally, in Chapter 4 we investigate the scenarios in which offering a loyalty reward program is

better for a merchant than traditional pricing. We look into the design of frequency reward

programs, where customers earn points as currency on their spending with merchants, and are able to

redeem these points for dollar-valued rewards after accumulating some threshold number of points. We

consider a competitive duopoly where one merchant offers a frequency reward program and the other

offers traditional pricing with discounts, and characterize a novel model of customer choice where

customers measure their utilities in rational economic terms. We assume two kinds of customers:

myopic and strategic (Yilmaz et al. [2016]). In addition, we assume that every customer has a

prior loyalty bias (Fader and Schmittlein [1993]) toward the reward program merchant, a parameter

drawn from a known distribution indicating an additional probability of choosing the reward program

merchant over the traditional pricing merchant. This bias increases the switching costs (Klemperer

[1995]) of strategic customers until a tipping point and we show that customer behavior exhibits a

phase transition: (s)he is not incentivized to visit the reward program merchant before attaining

a certain number of points from it, and after that (s)he strictly prefers making purchases from

the reward program merchant. These behavioral patterns cohere with the empirically observed and

psychologically motivated customer choice dynamics in past literature, thereby validating our model

(Kivetz et al. [2006], Dreze and Nunes [2004], Gao et al. [2014]). Moreover, we assume that the

traditional pricing merchant is non-strategic, and characterize the optimal reward design choice

for the merchant offering the frequency reward program, specifically, how the merchant should

decide the optimal point thresholds and dollar value of rewards to optimize for its revenue share

from the participating customer population. We impose an important constraint: the merchant

has to offer a one-design-fits-all reward program for the entire participating customer population,

and is not allowed to personalize the program for different customer segments. We show that the

optimal reward design parameters for maximizing the expected revenue for the reward program

merchant are independent of the above mentioned aspects of the customer population. We refer to

the reward program as being effective if the revenue objective of the merchant is better than that

of the traditional pricing merchant, and better than the revenue objective it could have earned by

not running any reward program. We show that for the reward program to be effective, a minimum

fraction of the participating customer population needs to be strategic. Correspondingly, there

is an optimal range of the loyalty bias distribution parameter. If the bias is high, the reward program

creates loss in revenues, as customers effectively gain rewards for “free”, whereas a low value of bias

leads to loss in market share to the competing merchant. In short, if a merchant can estimate the

customer population parameters, our framework and results provide theoretical guarantees on the

pros and cons of running a reward program against traditional pricing.

The remainder of the thesis is structured as follows. In Chapter 2, we study strategic network formation


of coalition loyalty programs. In Chapter 3, we formulate a novel model for settling credits in

coalition loyalty programs. And in Chapter 4, we investigate the design of optimal frequency reward

programs. Finally we conclude and discuss future directions in Chapter 5.

Chapter 2

Network Formation of Coalition Loyalty Programs

In this chapter, we look into the network formation problem in coalition loyalty programs. The

exchange of points between different merchants initiated by customers creates additional value and

cost for the merchants: value in the form of attracting new customers, and cost in the form of

losing customers to potential competitors. Such exchanges also transfer point liabilities between

merchant pairs, and hence require mutual settlement via payments. The exchange rates and payment

parameters are negotiated by the merchants through some bargaining process and lead to formation

of networks which are surprisingly complex (cf. Fig. 2.1). We model this as a network bargaining

problem. We show that merchants resorting to pairwise Nash Bargaining produce network structures

identical to those obtained by maximizing social welfare. Moreover, we show that there is no welfare

loss by such pairwise bargaining if there is no competition, and the loss is small in a large class of

practically motivated scenarios. This property is attractive since pairwise negotiations are much

more cost-effective and easier to implement than centralized solutions. Our theoretical predictions

cohere with empirical observations we make about such networks.

Modeling Coalition Loyalty Programs: A network of loyalty programs can be viewed as a

weighted directed graph, with nodes corresponding to merchants, and an edge from a merchant A

to merchant B with weight rAB corresponding to an agreement via which customers can convert 1

point issued by merchant B into rAB points issued by merchant A. We henceforth refer to A as the

source node and B as the sink node of this edge¹, and points issued by A as A-points and points

issued by B as B-points.

We aim to understand the structure of these coalition programs. We do so by studying a strategic

¹ These exchange rates are public, and many of them can be collected from the service: http://www.webflyer.com.

Figure 2.1: Extended Partners of Star Alliance Members: Shown are 4 Star Alliance members (Aeroplan, Avianca, Lufthansa, and Egypt Air) and their partnerships with local airline and hotel chains in addition to mutual partnerships between each other. An edge between two nodes indicates that the two merchants allow mutual exchange of reward points. The local partners of two different airlines usually do not form any mutual partnerships.

network formation game. Our model incorporates the following critical aspects of these networks:

• The non-linear nature of rewards, which encourages customers to convert all points into their

‘home program’.

• Presence of an edge (A,B) increases demand for B’s services by A’s loyal customers, thereby

increasing B’s revenues.

• Conversely, A may incur a cost due to lost sales, in particular, if B is a competitor.

• A higher exchange rate rAB leads to higher demand at B from A’s customers, as it earns them

a higher number of A-points. Moreover, if there are multiple (possibly multi-hop) paths between

A and B, then A’s loyal customers will use paths with the highest product of exchange rates to

maximize the number of A-points they receive.

• Loyalty points are also a source of liability for the issuing merchant, as they can be redeemed in

the future by the customers (Chun et al. [2015]). Hence by permitting the formation of the edge

(A,B), A makes a commitment to accept a share of B’s liability, for which B needs to compensate

A.
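The path-selection behavior in the list above (customers use the path with the highest product of exchange rates) can be sketched as a small search over the weighted directed graph. The function name `best_exchange_rate`, the merchants A, X, B, and the rates are hypothetical. The sketch uses a Bellman-Ford style relaxation on products and assumes there is no cycle whose rates multiply to more than 1 (no arbitrage); it follows the convention of this chapter, where an edge (A, B) with weight rAB converts 1 B-point into rAB A-points.

```python
def best_exchange_rate(edges, source, sink):
    """Largest product of edge weights over directed paths from source to sink.

    edges[(u, v)] = r means 1 v-point converts into r u-points, so the result
    is the most source-points obtainable per sink-point (0.0 if no path).
    Assumes no cycle has a rate product above 1.
    """
    nodes = {n for edge in edges for n in edge} | {source, sink}
    best = {n: 0.0 for n in nodes}
    best[source] = 1.0
    # At most |V| - 1 rounds of relaxation are needed without arbitrage cycles.
    for _ in range(len(nodes) - 1):
        updated = False
        for (u, v), r in edges.items():
            candidate = best[u] * r
            if candidate > best[v]:
                best[v] = candidate
                updated = True
        if not updated:
            break
    return best[sink]


# Hypothetical network: a direct edge (A, B) and a two-hop route via X.
edges = {("A", "B"): 0.5, ("A", "X"): 0.9, ("X", "B"): 0.8}
rate = best_exchange_rate(edges, "A", "B")
print(rate)
```

In this made-up network the two-hop route (0.9 times 0.8) beats the direct edge (0.5), so a type-A customer holding B-points would convert through X.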

Finally, a critical operational aspect of these networks is that they are often formed via negotiations


between merchants, and the resulting contracts cannot be easily modified. Moreover, in the absence

of a central agency, these contracts are usually negotiated bilaterally between merchant pairs, which

has a much lower setup cost than forming a centralized coalition.

Our Contributions: We study strategic network formation of coalition loyalty programs, under

a model which incorporates all the above aspects. In Section 2.1, we present the model, and fully

characterize the Nash Bargaining solution for two merchants. We also show that Nash Bargaining

maximizes the social welfare in the two merchant setting.

In Section 2.2, we conduct an empirical investigation of some existing coalition networks and

observe that even in the presence of a network the direct relationship between two merchants is

very closely associated with the degree of competition between them: the higher the competition, the lower

the mutual exchange rates. We argue that extending our model to multiple merchants and using

pairwise Nash Bargaining as a network bargaining solution produces networks that are structurally

identical to the ones observed empirically: complete K-partite graphs emerge, where the merchants

within each partition are competitors, but across partitions they do not compete.

Finally, in Section 2.3, we characterize the gap between the optimal social welfare and the

welfare obtained by pairwise Nash Bargaining over the network. We first show via a counterexample

that bilateral Nash Bargaining does not maximize social welfare, thereby indicating that centralized

coordination, though complicated to conduct, is useful in some settings. And, under mild conditions,

we show that the sub-optimality in social welfare under bilateral Nash Bargaining is small. A

particularly interesting case is where the merchants are completely heterogeneous and mutually

non-competing. In this case, it turns out that bilateral Nash Bargaining does maximize the social

welfare, and a complete directed graph emerges as the solution. We extend this result to the case

where merchants are less heterogeneous and semi-competitive.

In a nutshell, our results suggest that pairwise Nash Bargaining over the network leads to similar

structures as observed empirically, and achieves optimal or near-optimal welfare, for practically

relevant situations.

Related Work: The management and impact of loyalty programs is well studied (Cao et al.

[2015], Taylor and Neslin [2005]) – however, the literature on coalition loyalty programs is primarily

empirical (Flores-Fillol and Moner-Colonques [2007]). Lazzarini [2007] provides a survey of the

evolving structures of airline coalitions, suggesting a shift from bilateral ties to more connected and

complex structures over time.

Strategic network formation models have been used to study friendships in social networks (Jack-

son and Wolinsky [1996]), labor markets (Calvo-Armengol and Jackson [2007]), etc. – see Jackson

[2005] for an overview of such models. Our work is closest in spirit to the models of directed network

formation with unilateral decision making (Bala and Goyal [2000]).

Our work is also tangentially related to the literature on credit networks (DeFigueiredo and Barr


[2005]). In these networks, various authors have studied liquidity (Dandekar et al. [2011]), strategic

formation (Dandekar et al. [2015]), and credit updating (Resnick and Sami [2009]). Loyalty program

networks have features which are quite distinct from credit networks, in particular, the notion of

liability and the presence of exchange rates. To the best of our knowledge, our work is the first

attempt at formalizing the strategic aspects of these networks.

2.1 Bilateral Negotiation Model

We start by considering the case of two merchants, A and B, who both run individual loyalty

programs with their own loyal customer base, and are trying to negotiate a coalition loyalty program.

Let us first understand why A and B may both benefit from having a joint program. Suppose

that A’s loyal customers (whom we henceforth refer to as type-A customers) occasionally want to

avail services from B – arguably they are more likely to go to B for these services if the points that

they earn from B (henceforth, B-points) upon purchase can be converted back to A-points. This

excess demand is clearly beneficial to B – it not only brings in immediate revenues, but possibly

future revenues from type-A customers preferring B over its competitors. Moreover, the likelihood

of type-A customers bringing business to B will in general increase with the exchange rate rAB ,

i.e., the number of A-points earned by exchanging 1 B-point. Hence, the higher the exchange rate, the

better it is for B in this respect.

On the other hand, by allowing B-point to A-point conversions, A is taking on B’s liability of

providing value against the points accumulated by a type-A customer using B. The higher the

conversion rate, the higher is the volume of conversions, and thus higher is this liability. Further,

every visit of A’s customer to B results in some lost sales (current/future), in particular, if B is a

competitor. This can be viewed as a cost for A, which is high if A and B are competitors, and low

if they offer complementary services. Moreover, it is natural for B to compensate A for taking on

its liability. For similar reasons, A may also benefit if B allows conversions of A-points to B-points.

The exact exchange rates and payments made for each point converted are decided via a negotiation

between A and B.

Without loss of generality we can assume that the points in the two reward programs are nor-

malized, so that both A and B bear a cost of p dollars for each point redeemed by a customer.

Suppose that the exchange rates are rAB and rBA. One concern in any exchange network is the

presence of arbitrage opportunities – to prevent this, we require the product of rAB and rBA to be

not more than 1 (since otherwise customers can increase their points unboundedly for free). In fact,

in most cases, the observed exchange rates between loyalty programs are usually less than 1. Hence

we impose a stronger requirement that all rates lie in [0, 1].²

² Qualitatively, our results do not change if we impose the weaker condition that the products of pairwise exchange rates are bounded by 1.


Next, we assume that the rate at which points are converted from B to A, by type-A customers
buying from B, is an affine function of $r_{AB}$: $\theta_{AB} r_{AB} + \delta_{AB}$ (and similarly, we define
$\theta_{BA} r_{BA} + \delta_{BA}$ to be the rate of flow of points in the opposite direction). Here $\theta_{AB} + \delta_{AB}$
can be thought of as the rate at which type-A customers visit B and exchange B-points for A-points
under the highest possible exchange rate (i.e., 1), while $\delta_{AB}$ is the rate at which type-A customers
visit B when the exchange rate is 0. Note that this quantity allows us to abstract away the pricing by
different merchants as well as their schemes to grant points to their customers. For simplicity we set
the $\delta$ values to 0, although again our qualitative results remain unchanged if we relax this assumption.

We define $c_{AB} \in [0, 1]$ to be the competitiveness between the two merchants: value 1 refers to
the two merchants directly competing and offering substitutable services, and value 0 refers to them
offering complementary services. Note that competitiveness is symmetric, i.e., $c_{AB} = c_{BA}$. Next, let
$a_i \ge 0$ be the value that merchant $i$ ($i \in \{A, B\}$) obtains per point earned and converted by a type-$j$
customer ($j \in \{B, A\}$). As mentioned earlier, this value can be attributed to customer acquisition
and viewed as a combination of immediate revenues due to the purchase, as well as future revenues
due to the possibility that the customer may prefer $i$ over its competitors in the future. On the
other hand, we define the cost perceived by $i$ for every point converted from $j$ to $i$ as $a_j \times c_{ij}$, i.e., if
the two merchants are direct competitors, the perceived cost of losing a customer by one merchant
is the same as the perceived value of gaining the customer by the other merchant. Finally, we define $q_{ji}$
to be the payment made by $j$ to $i$ for every point conversion initiated by a type-$i$ customer from $j$
to $i$.

To summarize: for $i \in \{A, B\}$ and $j \neq i$, we have:

– Rate of points-transfer from $j$ to $i$: $\theta_{ij} r_{ij}$

– Payment received by $i$ for the above transfer: $\theta_{ij} r_{ij} q_{ji}$

– Perceived cost of $i$ for the above transfer: $\theta_{ij} r_{ij} a_j c_{ij}$

– Cost of $i$ for redeeming transferred liabilities: $\theta_{ij} r_{ij} \cdot r_{ij} p$

– Rate of points-transfer from $i$ to $j$: $\theta_{ji} r_{ji}$

– Value received by $i$ due to the above transfer: $\theta_{ji} r_{ji} a_i$

– Payment made by $i$ for the above transfer: $\theta_{ji} r_{ji} q_{ij}$

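Using the above, the utility derived by $i \in \{A, B\}$ is:

$$u_i = -\theta_{ij} r_{ij}^2 p - \theta_{ij} r_{ij} a_j c_{ij} + \theta_{ij} r_{ij} q_{ji} + \theta_{ji} r_{ji} (a_i - q_{ij}). \tag{2.1}$$

And the social welfare (i.e., the sum of utilities) is:

$$\begin{aligned}
u_i + u_j &= \theta_{ij} r_{ij} (a_j - c_{ij} a_j - r_{ij} p) + \theta_{ji} r_{ji} (a_i - c_{ji} a_i - r_{ji} p) \\
&= \theta_{ij} r_{ij} (a_j (1 - c_{ij}) - r_{ij} p) + \theta_{ji} r_{ji} (a_i (1 - c_{ji}) - r_{ji} p)
\end{aligned} \tag{2.2}$$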

The quantity $a_j \times (1 - c_{ij})$ is the difference between the benefit that merchant $j$ gets, and the
perceived cost to merchant $i$, due to a type-$i$ customer visiting $j$. This is the additional social welfare
in the two merchant network per transferred liability from $j$ to $i$. Note that this additional welfare
is maximized when the two merchants are complementary ($c_{ij} = 0$). Intuitively this means that
the two merchants offering complementary services can combine their reward programs by offering
relevant exchange rates to minimize the outflow of their customers to other competitors.

Solution Concept: We use Nash Bargaining as a solution concept to resolve bilateral negotia-

tions between merchants. Under this, the directed exchange rates and the payments are chosen to

maximize the product of the net utilities of the two merchants. The following result characterizes

the Nash Bargaining solution in this setting:

Theorem 2.1.1. Under the Nash Bargaining solution, we have for $i, j \in \{A, B\}$, $i \neq j$:

1. $r_{ij} = \min\left\{\frac{a_j (1 - c_{ij})}{2p}, 1\right\}$.

2. $q_{ij}$ and $q_{ji}$ are any values that satisfy:

$$\theta_{ij} r_{ij} (a_j - 2 q_{ji} + a_j c_{ij} + r_{ij} p) = \theta_{ji} r_{ji} (a_i - 2 q_{ij} + a_i c_{ji} + r_{ji} p)$$

Moreover, these parameters also maximize the social welfare.

Proof. First, recall the utilities of both merchants from Eq. 2.1:

$$u_i = -\theta_{ij} r_{ij}^2 p - \theta_{ij} r_{ij} a_j c_{ij} + \theta_{ij} r_{ij} q_{ji} + \theta_{ji} r_{ji} (a_i - q_{ij})$$

$$u_j = -\theta_{ji} r_{ji}^2 p - \theta_{ji} r_{ji} a_i c_{ji} + \theta_{ji} r_{ji} q_{ij} + \theta_{ij} r_{ij} (a_j - q_{ji})$$

The threat point is still (0, 0), and hence under Nash Bargaining $u_i u_j$ is maximized. This implies
that the derivatives of $u_i u_j$ w.r.t. $q_{ij}$ and $q_{ji}$ are 0, as these parameters are unconstrained. For
the parameters $r_{ij}$ and $r_{ji}$, if the derivative of $u_i u_j$ is positive throughout the feasible range of
exchange rates (i.e., [0, 1]), then the product is maximized at the exchange rate equal to 1; otherwise,
we find the exchange rate by setting the derivative to 0. Observe that for $x$ being any of the
unconstrained parameters:

$$\frac{\partial (u_i u_j)}{\partial x} = u_j \frac{\partial u_i}{\partial x} + u_i \frac{\partial u_j}{\partial x} = 0$$

Now we calculate $\frac{\partial u_i}{\partial x}$ and $\frac{\partial u_j}{\partial x}$ for each $x$:

$$\frac{\partial u_i}{\partial q_{ij}} = -\theta_{ji} r_{ji}, \qquad \frac{\partial u_j}{\partial q_{ij}} = \theta_{ji} r_{ji}$$

The above three equations immediately imply $u_i = u_j$. Now observe the following:

$$\frac{\partial u_i}{\partial r_{ij}} = \theta_{ij} (q_{ji} - a_j c_{ij} - 2 r_{ij} p), \qquad \frac{\partial u_i}{\partial r_{ji}} = \theta_{ji} (a_i - q_{ij})$$

$$\frac{\partial u_j}{\partial r_{ji}} = \theta_{ji} (q_{ij} - a_i c_{ji} - 2 r_{ji} p), \qquad \frac{\partial u_j}{\partial r_{ij}} = \theta_{ij} (a_j - q_{ji})$$

From the above equations, together with $u_i = u_j$, we get the following:

$$\frac{\partial (u_i u_j)}{\partial r_{ij}} = u_i \left(\theta_{ij} (a_j (1 - c_{ij}) - 2 r_{ij} p)\right) \tag{2.3}$$

Now since $a_j (1 - c_{ij})$ is non-negative and the exchange rate $r_{ij}$ is bounded by 1, we get the
maximizing rate as $\min\left\{\frac{a_j (1 - c_{ij})}{2p}, 1\right\}$. The same result holds for $r_{ji}$. Also, equating $u_i$ and
$u_j$, we get the following:

$$\theta_{ij} r_{ij} (a_j - 2 q_{ji} + a_j c_{ij} + r_{ij} p) = \theta_{ji} r_{ji} (a_i - 2 q_{ij} + a_i c_{ji} + r_{ji} p)$$

It is easy to observe that one solution to the above equation is $q_{ji} = (a_j (1 + c_{ij}) + r_{ij} p)/2$ and
$q_{ij} = (a_i (1 + c_{ji}) + r_{ji} p)/2$, and we can substitute the appropriate values of $r_{ij}$ and $r_{ji}$ for the
different cases.

Showing that these parameters maximize social welfare is straightforward. First, we observe
from Eq. 2.2 that the social welfare is independent of the payment parameters, as they cancel out.
Taking the derivative of the sum of the utilities in Eq. 2.2 gives the same values of exchange rates as
obtained above.
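The characterization above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the function names (`nash_rate`, `utility`, `bargain`) and the parameter values used with them are illustrative assumptions. It evaluates the exchange rates of Theorem 2.1.1, the particular payment pair given in the proof, and the utilities of Eq. 2.1, so that one can confirm that the two utilities coincide.

```python
def nash_rate(a_j, c_ij, p):
    """Exchange rate r_ij = min(a_j (1 - c_ij) / (2p), 1) from Theorem 2.1.1."""
    return min(a_j * (1.0 - c_ij) / (2.0 * p), 1.0)


def utility(theta_ij, theta_ji, r_ij, r_ji, a_i, a_j, c, q_ij, q_ji, p):
    """Utility u_i of Eq. 2.1; c is the symmetric competitiveness c_ij = c_ji."""
    return (-theta_ij * r_ij ** 2 * p          # cost of redeeming transferred points
            - theta_ij * r_ij * a_j * c        # perceived cost of customers visiting j
            + theta_ij * r_ij * q_ji           # payment received from j
            + theta_ji * r_ji * (a_i - q_ij))  # acquisition value minus payment to j


def bargain(a_i, a_j, c, p, theta_ij, theta_ji):
    """Rates, one valid payment pair, and utilities (hypothetical helper)."""
    r_ij = nash_rate(a_j, c, p)
    r_ji = nash_rate(a_i, c, p)
    # Payment pair from the proof: each side of the settlement equation is 0.
    q_ji = (a_j * (1 + c) + r_ij * p) / 2
    q_ij = (a_i * (1 + c) + r_ji * p) / 2
    u_i = utility(theta_ij, theta_ji, r_ij, r_ji, a_i, a_j, c, q_ij, q_ji, p)
    # u_j is the same expression with the roles of i and j swapped.
    u_j = utility(theta_ji, theta_ij, r_ji, r_ij, a_j, a_i, c, q_ji, q_ij, p)
    return r_ij, r_ji, q_ij, q_ji, u_i, u_j
```

For instance, `bargain(1.0, 1.5, 0.2, 1.0, 2.0, 3.0)` uses made-up values for the acquisition values, competitiveness, redemption cost, and transfer rates; the two returned utilities should agree up to floating-point error, and their sum equals the social welfare of Eq. 2.2.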

Remarks: Intuitively the above result on exchange rates means the following: if the net value of
acquiring a customer, $a_j (1 - c_{ij})$, is not positive, then the edge is not formed; if it is positive, then the
exchange rate on the edge is directly proportional to the acquisition value and decreases with
the competitiveness, being bounded by the maximum limit of 1.

Also, it is not true in general that the Nash bargaining solution maximizes social welfare, but it

is in our case. To see why, note that the social welfare, which is the sum of the utilities of the two

merchants, is independent of the payments qij . Further for any fixed value of social welfare, these

payments can be designed in such a way that the two utilities are equal to each other, and equal

to half the social welfare, thereby maximizing the product of the utilities for a given social welfare.


Thus, in order to maximize the product of the utilities, it is necessary that the sum of the utilities

is maximized.

We presented one possible set of payments in the proof. When the exchange rates are evaluated
to be less than 1, this set of payments reduces to the following: substituting $r_{ij} = \frac{a_j (1 - c_{ij})}{2p}$, and
setting each side of the settlement equation to 0, gives $q_{ji} = \frac{a_j (3 + c_{ij})}{4}$ and $q_{ij} = \frac{a_i (3 + c_{ji})}{4}$.

However, there are other solutions as well. For instance, if $\theta_{ij} r_{ij} a_j = \theta_{ji} r_{ji} a_i$, then $q_{ij} = q_{ji} = 0$

is a solution as well. In this case, the reciprocal benefits exactly match for the two merchants and

hence no payments are needed. In fact one can show that there is always a solution where one of

qij or qji is 0, i.e., payments are unidirectional.
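One way to see the unidirectional-payment remark is to solve the settlement equation of Theorem 2.1.1 with one of the two payments fixed at 0. The Python sketch below does this under stated assumptions: the function name `unidirectional_payments` is hypothetical, both transfer rates $\theta$ are taken to be positive, and both exchange rates are required to be positive so that the division is well defined.

```python
def unidirectional_payments(a_i, a_j, c, p, theta_ij, theta_ji):
    """Return (q_ij, q_ji) solving the settlement equation with one payment at 0.

    Assumes theta_ij, theta_ji > 0 and positive exchange rates in both directions.
    """
    r_ij = min(a_j * (1 - c) / (2 * p), 1.0)
    r_ji = min(a_i * (1 - c) / (2 * p), 1.0)
    if r_ij <= 0 or r_ji <= 0:
        raise ValueError("sketch assumes positive exchange rates in both directions")
    # Settlement equation: left - 2*theta_ij*r_ij*q_ji = right - 2*theta_ji*r_ji*q_ij
    left = theta_ij * r_ij * (a_j * (1 + c) + r_ij * p)
    right = theta_ji * r_ji * (a_i * (1 + c) + r_ji * p)
    if left >= right:
        # Only j pays i.
        return 0.0, (left - right) / (2 * theta_ij * r_ij)
    # Only i pays j.
    return (right - left) / (2 * theta_ji * r_ji), 0.0
```

Whichever side of the settlement equation is larger determines the direction of the single payment; when the two sides are equal, both payments are 0, which matches the reciprocal-benefit case described above.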

2.2 Empirical Investigation

Figure 2.2: Exchange Rates within the Star Alliance Program:

First, we look into the case of Star Alliance - a coalition of popular airlines across the world
including Aeroplan, Asiana Airlines, Thai Air, Egypt Air, Indian Airlines, and many others. Star
Alliance acts as a central moderator to set the exchange rates between different airlines to allow
conversion of miles. Figure 2.2 shows some of these exchange rate relationships - nodes indicate the
airline partners and an edge in clockwise direction from a partner A to B indicates that B's miles
can be converted into A miles at the exchange rate specified on the edge. The airline partners shown
in the graph are mostly non-competitive as they belong to different countries. We collected this data
by crawling websites of different airlines. We make a few immediate observations - first, the graph
is almost a complete graph (with the exception of Air India and Avianca Airlines) with different
exchange rates between airline pairs. And second, for any airline, the incoming exchange rates, i.e.,
the exchange rates to convert that airline's points into other airlines' points, are close to each other.

Figure 2.3: Partnerships Across Multiple Merchants: Nodes represent the merchants and an undirected edge between two nodes indicates that they mutually allow exchange of points. Data collected by scraping http://www.webflyer.com. The labels on the nodes are the codeshare merchant abbreviations – for instance, the sixth node from top left side is US01 which stands for U.S. Airways; the eighth and ninth nodes from the top left are AM and AC which respectively stand for Amtrak and Aeroplan Canada. Similarly one of the nodes in the middle portion is MC which stands for American Express. And the third node from top right is HY01 which stands for Hyatt Hotels.

Second, we look into a coalition of many more merchant partners, that seems to have emerged

out of bilateral negotiations between merchants. Figure 2.3 shows the exchange rate relationships

between many different merchants who run their standalone reward programs. These exchange rates

have been obtained by scraping http://www.webflyer.com. This website has a list of publicly

visible exchange rate relationships between merchants. The nodes in Figure 2.3 indicate merchants,


and an edge between two nodes indicates that they allow exchange of reward points between each

other. The graph shows three clear partitions: airlines on the left, credit card companies in the
middle, and hotel chains on the right. There seem to be no edges within partitions, and across
partitions the graph appears to be almost complete. Nodes within a partition

seem to be competitors, and have no partnerships, whereas nodes across partitions seem to be

complementary. We make the following hypothesis: direct competitors do not form a partnership,

whereas complementary merchants do. But some merchants appear to be outliers in the figure. For

instance, the node labelled as AC, which stands for Aeroplan Canada, is connected to some nodes

on the left partition in addition to being connected to all nodes in the other two partitions. Aeroplan

Canada, being a Canadian airline, does not directly compete with many airlines in the left partition,

as most of them are U.S. based. Thus, the partnerships seem to have some strong relationship with

the complementarity between the merchant pairs. Note that we obtained a similar result using our

model in the previous section via Nash Bargaining between pairs of merchants – the exchange rate

was directly proportional to the complementarity (one minus competitiveness). Hence, extending

our model to multiple merchants and using Nash Bargaining between all pairs of merchants without

considering the outside network would create identical structures to the ones observed empirically.

We extend our model along the lines of the above hypotheses and show that the bargaining

methodology we propose is not too far from the optimal in terms of the overall social welfare.

2.3 Model Extensions

We extend the model to more than two merchants negotiating a joint reward program. This involves

deciding the directed exchange rates between every pair of merchants, and the payments made by

the merchants to each other to compensate for the additional liabilities, just like we saw in the case

of two merchants. But in this case a new feature of the problem arises. Suppose there are three

merchants A, B, and C. If the direct exchange rate between two of them, say A and C, is lower than

the “indirect” exchange rate obtained by first converting C-points into B-points and then converting

those into A-points, i.e., $r_{AC} < r_{AB} \times r_{BC}$, then the direct exchange rate $r_{AC}$ is rendered defunct,

since no customer will use it to convert points.
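The condition for a direct rate becoming defunct is easy to test mechanically. The sketch below is illustrative only: the function name `defunct_edges` and the dictionary layout (`rates[(i, k)]` holding $r_{ik}$, the number of $i$-points obtained per $k$-point) are assumptions made here, and only two-hop detours are examined, as in the three-merchant discussion above.

```python
from itertools import permutations


def defunct_edges(rates):
    """List (i, k, j) triples where the detour through j beats the direct rate r_ik.

    rates[(i, k)] is r_ik; a missing key is treated as a rate of 0 (no edge).
    Only two-hop detours are checked in this sketch.
    """
    merchants = sorted({m for edge in rates for m in edge})
    dominated = []
    for i, j, k in permutations(merchants, 3):
        direct = rates.get((i, k), 0.0)
        indirect = rates.get((i, j), 0.0) * rates.get((j, k), 0.0)
        if indirect > direct:
            dominated.append((i, k, j))
    return dominated
```

With the made-up rates `{("A", "B"): 0.5, ("B", "C"): 0.5, ("A", "C"): 0.1}`, the detour through B gives 0.25 A-points per C-point, so the direct A-C rate of 0.1 is reported as defunct.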

Not giving due consideration to this fact during the multi-party negotiation can have a severe effect

on the social welfare. For instance, this could happen if all the merchants set their exchange rates

by resorting to pairwise bilateral negotiations, without considering the externality imposed by the

decisions of others. Consider the example below.

Example: Let A and C be two competing airlines that operate between the same cities. We
capture this by having $c_{AC} = 1$. Let B be a hotel, which naturally does not compete with either
A or C. We capture this by having $c_{AB} = c_{BA} = 0$ and $c_{CB} = c_{BC} = 0$. Further suppose that
$a_A = a_B = a_C = a$. Since A and C are direct competitors, and $p > 0$, there is no additional
cumulative welfare if some agents convert points between them. In this case, value is actually
lost because of services provisioned against the points converted. Hence ideally, the exchange rates
between them should be 0. This may not be possible because of the existence of the indirect exchange
rate through B (since having a joint program with B may be welfare improving). But if A and B,
and C and B undergo bilateral negotiations without considering this effect of their decisions, the
resulting indirect exchange rate between A and C (through B) would typically be higher than the
one that optimizes the welfare. We can show that in this case, the worst case ratio of the optimal
social welfare to the welfare under bilateral Nash Bargaining is approximately 1.58, if the maximum
rate of points-transfer ($\theta$) is the same between every merchant pair (Appendix 2.5.1). In fact, this
worst case ratio holds true even for a large class of similar situations.

Let there be $K$ classes of merchants, such that within each class the merchants are competitors,
while across classes the merchants are non-competing. Assume that $a_j = a$ for all merchants. For
any two merchants $i$ and $j$ within a class, suppose that $c_{ij} = c_{ji} = 1$. For any two merchants $i$ and
$j$ belonging to different classes, suppose that $c_{ij} = c_{ji} = 0$. And assume that the maximum rate of
points-transfer ($\theta$) is the same between every merchant pair. Then we have the following result:

Proposition 1. For the above mentioned model, the worst case ratio of the optimal social welfare
to the welfare obtained using bilateral Nash Bargaining is approximately 1.58.

We conjecture that 1.58 is still an upper bound on the ratio of welfare values in the more general
model where $a_j = a_\kappa$, where $\kappa$ is the class of merchant $j$, and for any two merchants $i$ and $j$
within a class $\kappa$, we have $c_{ij} = c_{ji} = 1$. Moreover, in this more general model, we can show that
the social welfare maximizing and pairwise bilateral Nash Bargaining solutions lead to structurally
identical graphs: the exchange rates are zero between every merchant pair within the same class,
and they are non-zero across merchants belonging to different classes. These results agree with
our empirical observations (cf. Fig. 2.3).

Non-Competing Merchants: We now show that in the case where all merchants are non-competing,
i.e., $c_{ij} = c_{ji} = 0$ for all $i$ and $j$, the above issue does not arise, and somewhat surprisingly,
pairwise bilateral Nash Bargaining leads to the social welfare maximizing outcome. The
main point underlying this result is that if the exchange rates are set by bilateral Nash Bargaining
when the merchants are non-competing, then for a customer, the direct exchange rate between any
two merchants is at least as high as any indirect exchange rate.

Claim 1. For any merchant pair, let the exchange rates between them be set according to Theorem 2.1.1.
Then for any three merchants $i$, $j$ and $k$, $r_{ik} \ge r_{ij} \times r_{jk}$.

We leave the proof of the above claim to the appendix (Appendix 2.5.3). Observe that Theorem 2.1.1
implies that bilateral Nash Bargaining maximizes the sum of the utilities of all the
merchants, assuming that only the direct exchange rates will be used by the customers (in this case
the social welfare maximization problem decomposes across the different pairs of merchants). But
if the customers are assumed to use the maximum exchange rate path between any two merchants,
then the maximum sum of utilities of the merchants can only be lower. The above claim, however,
implies that the solution obtained through bilateral Nash Bargaining naturally has the property that
the direct exchange rates are at least as high as the indirect ones between every merchant pair.
Summarizing, we thus have the following result:

Theorem 2.3.1. Suppose that merchants are non-competitive. Then exchange rates chosen by
pairwise bilateral Nash Bargaining between different pairs of merchants maximize the social welfare.

The above result indicates that, when merchants are non-competing, pairwise negotiations are
equivalent to the centrally coordinated optimal solution, and this solution is a complete graph in
which the exchange rate depends only on the destination merchant. We extend the above result to the
case when the merchants are less heterogeneous and semi-competitive.

Semi-Competing Merchants: We now show that even if there is competition between merchants,
but the variance in competitiveness is not too high, then under mild conditions the social
welfare is still maximized by using Nash Bargaining between pairs of merchants. Specifically we
define the following two properties:

1. Semi-competitiveness: We say merchants within a network are semi-competitive if for any
three merchants $i$, $j$, and $k$ the following holds:

$$1 - c_{ik} \ge (1 - c_{ij}) \cdot (1 - c_{jk})$$

Intuitively, this condition says that the variance in competitiveness is not too high.

2. Low acquisition value: The acquisition value $a_j$ for all merchants $j$ is bounded above by twice
the servicing cost, i.e., $a_j \le 2p$. This condition adds some level of homogeneity to all the merchants.

Claim 2. Let all merchants be semi-competitive and have low acquisition values, and let the exchange
rates between them be set according to Theorem 2.1.1. Then for any three merchants $i$, $j$, and $k$,
$r_{ik} \ge r_{ij} \times r_{jk}$.

Proof. Between any two merchants $i$ and $j$, the exchange rate set is $r_{ij} = \min\left\{\frac{a_j (1 - c_{ij})}{2p}, 1\right\}$. But
given the conditions, $\frac{a_j (1 - c_{ij})}{2p} \le 1$. Thus $r_{ij} = \frac{a_j (1 - c_{ij})}{2p}$. Then it is easy to see that for any three
merchants $i$, $j$, and $k$, the following holds:

$$\begin{aligned}
& 1 - c_{ik} \ge (1 - c_{ij}) \cdot (1 - c_{jk}) \\
\implies\ & \frac{a_k (1 - c_{ik})}{2p} \ge (1 - c_{ij}) \cdot \frac{a_k (1 - c_{jk})}{2p} \\
\implies\ & r_{ik} \ge (1 - c_{ij}) \cdot r_{jk} \\
\implies\ & r_{ik} \ge \frac{a_j (1 - c_{ij})}{2p} \cdot r_{jk} \\
\implies\ & r_{ik} \ge r_{ij} \cdot r_{jk}
\end{aligned}$$

Here the third implication uses the low acquisition value condition $\frac{a_j}{2p} \le 1$.

Following the above claim, Theorem 2.3.1 easily extends to this case as well and we get the
following:

Corollary 2.3.1. Suppose that merchants are semi-competitive and have low acquisition values.
Then exchange rates chosen by pairwise bilateral Nash Bargaining between different pairs of
merchants maximize the social welfare.
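Claim 2 can be sanity-checked on random instances. The Python sketch below rests on a construction that is an assumption of this example rather than part of the model: writing $d_{ij} = -\log(1 - c_{ij})$, the semi-competitiveness condition becomes the triangle inequality $d_{ik} \le d_{ij} + d_{jk}$, so placing merchants at random points on a line and setting $c_{ij} = 1 - e^{-|x_i - x_j|}$ yields a symmetric, semi-competitive instance. The function name `check_claim_2` is hypothetical.

```python
import itertools
import math
import random


def check_claim_2(n=6, p=1.0, seed=0):
    """Check r_ik >= r_ij * r_jk on one random semi-competitive instance."""
    rng = random.Random(seed)
    # Assumed construction: positions on a line induce a metric d_ij = |x_i - x_j|,
    # and c_ij = 1 - exp(-d_ij) then satisfies semi-competitiveness.
    pos = [rng.uniform(0.0, 2.0) for _ in range(n)]
    # Low acquisition value: a_j <= 2p for every merchant.
    a = [rng.uniform(0.0, 2.0 * p) for _ in range(n)]
    c = [[1.0 - math.exp(-abs(pos[i] - pos[j])) for j in range(n)] for i in range(n)]
    # Exchange rates from Theorem 2.1.1.
    r = [[min(a[j] * (1.0 - c[i][j]) / (2.0 * p), 1.0) for j in range(n)]
         for i in range(n)]
    return all(r[i][k] >= r[i][j] * r[j][k] - 1e-12
               for i, j, k in itertools.permutations(range(n), 3))
```

Calling `check_claim_2()` with different seeds exercises the inequality on different instances; a failure would indicate either a violated assumption or an error in the rates.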

2.4 Conclusions

We modeled the inter-partner utilities that arise in coalition loyalty programs and used Nash Bargaining
between merchant pairs as a tool for negotiating the exchange rate and payment parameters. We

completely characterize the two merchant setting and show that the bargained solution has exchange

rates proportional to the complementarity between the merchants and that the solution maximizes

social welfare. Our model and results validate the empirical hypotheses: complete K-partite graphs

emerge where merchants within a partition are competitors and merchants across partitions are

non-competing. Finally, we extend the model to multiple merchants and show that pairwise Nash

Bargaining still achieves optimal or near optimal social welfare: if merchants are non-competing or

semi-competitive, the social welfare obtained is optimal; and if there is competition between some

merchants, then for a wide class of network structures of practical interest, the ratio between the

optimal welfare and the social welfare obtained is bounded above by a small quantity 1.58.

In short, pairwise Nash Bargaining is a powerful tool for conducting negotiations for coalition

formation in loyalty programs.


2.5 Appendix

2.5.1 Analysis of Social Welfare Gap Example

Since the maximum rate of points-transfer is the same between every merchant pair, we can ignore
it for calculating the ratio of social welfares. We also assume $a \le 2p$. Let us analyze the optimal
social welfare and the social welfare obtained via bilateral Nash Bargaining. First, observe the social
welfare (ignoring $\theta$):

$$-r_{AB}^2 p + r_{AB} a - r_{BC}^2 p + r_{BC} a - (r_{AB} r_{BC})^2 p \tag{2.4}$$

Nash Bargaining Solution

Under bilateral Nash Bargaining, we get the exchange rates as $r_{AB} = r_{BC} = \frac{a}{2p} \le 1$ and no direct
edge between A and C (Theorem 2.1.1). For simplicity let $\frac{a}{2p} = t$. The social welfare in this case is:

$$-\frac{a^2}{4p} + \frac{a^2}{2p} - \frac{a^2}{4p} + \frac{a^2}{2p} - \frac{a^4}{16p^3} = \frac{a^2}{2p} - \frac{a^4}{16p^3} = 2pt^2 (1 - t^2/2) \tag{2.5}$$

Optimal Welfare

By symmetry, the optimal social welfare is attained with $r_{AB} = r_{BC}$.
Let these two exchange rates be $r$. Then the social welfare is:

$$2(-r^2 p + ra) - r^4 p = p(-2r^2 + 4rt - r^4) \tag{2.6}$$

The above value is maximized at the value of $r$ where the derivative of the function w.r.t. $r$ is 0.
The derivative is $p(-4r^3 - 4r + 4t)$. On substituting the root of this derivative as the value of $r$ in
the optimal welfare, and maximizing the ratio of the optimal welfare to the Nash Bargaining welfare
over $t$ from 0 to 1, we find that the ratio is maximized at $t = 1$. The value of $r$ obtained at $t = 1$ is
around 0.68 and the maximized ratio value is around 1.58.
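The numbers quoted above can be reproduced with a few lines of Python. This is a minimal sketch under the assumptions of this appendix ($a \le 2p$, equal $\theta$); the function names `optimal_rate` and `welfare_ratio` and the grid over $t$ are choices made here for illustration. The first-order condition $-4r^3 - 4r + 4t = 0$ is solved by bisection, and both welfare expressions are divided by $p$, which cancels in the ratio.

```python
def optimal_rate(t, tol=1e-12):
    """Solve r**3 + r = t for r in [0, 1] by bisection (first-order condition)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** 3 + mid < t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def welfare_ratio(t):
    """Ratio of optimal welfare (Eq. 2.6) to Nash Bargaining welfare (Eq. 2.5)."""
    r = optimal_rate(t)
    optimal = -2 * r ** 2 + 4 * r * t - r ** 4   # Eq. 2.6 divided by p
    nash = 2 * t ** 2 * (1 - t ** 2 / 2)         # Eq. 2.5 divided by p
    return optimal / nash


# Scan t = a / (2p) over a grid in (0, 1] and locate the worst case.
grid = [k / 1000 for k in range(1, 1001)]
worst_t = max(grid, key=welfare_ratio)
```

On this grid the worst case is attained at the right endpoint, with `optimal_rate(1.0)` close to 0.68 and `welfare_ratio(1.0)` close to 1.58, in line with the values stated above.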

2.5.2 Proof of Proposition 1

First, we prove the proposition for 2 classes of merchants A and B; extending the result to
$K$ classes is then fairly straightforward. Let there be $m$ merchants in A and $n$ in B. The proof proceeds
very similarly to the above simplified example.

Nash Bargaining Solution

Clearly the pairwise bilateral Nash Bargaining solution creates edges between merchant pairs across
A and B with exchange rate $r = \frac{a}{2p}$ (Theorem 2.1.1). This is assuming $a$ is no more than $2p$. Also
pairwise Nash Bargaining does not create an edge between merchant pairs within any partition. Let
$\frac{a}{2p} = t \le 1$. Consider three merchants A, B, and C, such that A and C are in the same partition
and B is in the other partition, same as the above example. Overall social welfare is just the welfare
obtained by these three merchants times the number of such merchant triplet combinations across
the two partitions. The welfare obtained between these three merchants can be calculated exactly
like above. Thus the value is:

$$\begin{aligned}
& (m(m-1)n + n(n-1)m) \times 2pt^2 (1 - t^2/2) \\
&= mn(m + n - 2) \times 2pt^2 (1 - t^2/2)
\end{aligned} \tag{2.7}$$

Optimal Welfare

Again because of symmetry, we can argue that exchange rates between any merchant pairs across the
two partitions are the same. Let this quantity be $r$. Within any partition, having an exchange
rate greater than $r^2$ will only hurt welfare, and an exchange rate less than $r^2$ will never
be used. Thus, there are no edges between merchant pairs within any partition. Again like the
preceding argument, the overall social welfare can be written as:

$$\begin{aligned}
& (m(m-1)n + n(n-1)m) \times p(-2r^2 + 4rt - r^4) \\
&= mn(m + n - 2) \times p(-2r^2 + 4rt - r^4)
\end{aligned} \tag{2.8}$$

It is easy to see that the ratio of welfares is exactly the same as that in the preceding example,
and hence the maximum value is the same, approximately 1.58.

2.5.3 Proof of Theorem 2.3.1

As we argued, the proof of the theorem easily follows after proving Claim 1. We just prove the claim

here.

Proof. We first show that for any three nodes $i$, $j$, and $k$, the following holds:

$$r_{ik} \ge r_{ij} \times r_{jk} \tag{2.9}$$

Observe that $r_{ik} = r_{jk} = \min\left\{\frac{a_k}{2p}, 1\right\}$ (Theorem 2.1.1), and $r_{ij} \le 1$ by definition. Hence, the above
inequality always holds.

Hence, the exchange rate along any directed edge is the maximum among all paths between those

two merchants. Thus all transactions happen via direct edges, and no transactions happen along

paths of length more than 1. Now the proof of the theorem easily follows.

Chapter 3

Credit Settlement in Coalition

Loyalty Programs

In this chapter we introduce an alternate model for settling transferred liabilities between merchants

participating within a coalition loyalty program.

Overview of our Model: We propose an alternate model for credit settlement in coalition loyalty

programs based on an abstraction of credit networks (Karlan et al. [2009], Ghosh et al. [2007], De-

Figueiredo and Barr [2005]). Credit networks are a versatile abstraction for modeling trust between

entities in a network. In our model, merchants participating in a coalition loyalty program, unilater-

ally commit to accepting each others’ reward points up to a limit (which we call the credit capacity)

at a specified exchange rate. Customers induce exchange of reward points between merchants as

discussed in the previous chapter. We refer to these exchanges of reward points as transactions in

the network of merchants. These transactions happen along paths with positive credit capacities

and maximum exchange rates to minimize conversion loss: i.e. a conversion of some merchant u’s

points into some other merchant v’s points can be facilitated only if there is a path with sufficient

credit capacity from v to u in the network, and the transaction occurs along the path that minimizes

the conversion loss. Suppose a merchant u commits to accepting up to c points issued by a merchant

v at an exchange rate of r. Say some customer transaction induces v to convert x v-points (where

x < c) for u-points; it gets xr u-points in return. The credit extended by u to v depletes by x, and v

promises to redeem x v-points to u in return for xr u-points at a future time. This is represented as

an IOU which can be used to allow for future customer transactions (we use the convention of representing
credit capacity and exchange rate along an edge in the currency of the target merchant).

Merchants that do not directly extend credit to each other can still exchange points via trusted

intermediaries. Thus, the credit network acts as a decentralized ledger that obviates the need for a


central entity to keep accounts.

Hence, transactions change credit capacities, and also introduce credit on reverse-edges to keep

track of IOUs. The exchange rates on reverse-edges are reciprocals of those on forward-edges, and

thus, they could be significantly greater than 1. The state of the system is the set of credit capacities

along all edges including reverse-edges (initially the credit capacity along each reverse edge is 0).

We say that the system is in a no-arbitrage state if the product of exchange rates along any directed

cycle with positive credit capacity is not more than 1.

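[Figure 3.1, panel (a), before the transaction: merchants i, j, and k, with edges labeled (credit capacity; exchange rate) as $c_{ij}; r_{ij}$, $c_{ji}; r_{ji}$, $c_{jk}; r_{jk}$, and $c_{kj}; r_{kj}$. Panel (b), after the transaction: the labels on the two used edges become $(c_{ij} - x \cdot r_{jk}); r_{ij}$ and $(c_{jk} - x); r_{jk}$, the other two edges are unchanged, and two dotted IOU edges appear with labels $(x \cdot r_{ij} r_{jk}); \frac{1}{r_{ij}}$ and $(x \cdot r_{jk}); \frac{1}{r_{jk}}$.]

Figure 3.1: Illustrative Example of Dynamic IOU Settlement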

Figure 3.1 illustrates our model and how transactions change the state of the system. There are three merchants: $i$, $j$, and $k$. Merchant $j$ extends a credit of $c_{ji}$ $i$-points to $i$ at an exchange rate of $r_{ji}$, and $c_{jk}$ $k$-points to merchant $k$ at an exchange rate of $r_{jk}$; similarly merchant $i$ extends credit to $j$, and $k$ extends credit to $j$. We assume that there is no arbitrage in the starting system, i.e., the product of exchange rates along any directed cycle is not more than 1. Now suppose that some customer transaction induces merchant $k$ to convert $x$ $k$-points into $i$-points. The maximum exchange rate path in the network from $i$ to $k$ is chosen for the transaction and $k$ receives $x \cdot (r_{ij} r_{jk})$ $i$-points. The credit extended by $j$ to $k$ depletes by $x$. Similarly, the credit extended by $i$ to $j$ depletes by $x \cdot r_{jk}$. IOUs are introduced as shown by dotted edges in Fig. 3.1(b). This means that merchant $k$ promises to give $x$ $k$-points to $j$ in return for $x \cdot r_{jk}$ $j$-points. Similarly, $j$ promises to give $x \cdot r_{jk}$ $j$-points to $i$ in return for $x \cdot (r_{ij} r_{jk})$ $i$-points. Say at a future time, another customer transaction induces merchant $j$ to convert $x \cdot r_{jk}$ $j$-points to $k$-points. The maximum exchange rate path is via the dotted green edge from $k$ to $j$, as the no-arbitrage assumption implies $1/r_{jk} \ge r_{kj}$, and thus merchant $j$ receives $x$ $k$-points in return.
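To make the bookkeeping in this example concrete, here is a small numeric walk-through. All numbers are hypothetical and chosen only for illustration; exact fractions are used so that the reciprocal rates on the IOU edges cancel exactly.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only for illustration.
r_ij, r_jk = Fraction(1, 2), Fraction(4, 5)   # exchange rates on the edges i-j and j-k
c_ij, c_jk = Fraction(20), Fraction(30)       # initial credit capacities
x = Fraction(10)                              # k converts x k-points into i-points

# Forward edges deplete as described in the text.
c_jk_after = c_jk - x            # credit extended by j to k
c_ij_after = c_ij - x * r_jk     # credit extended by i to j
i_points_received = x * r_ij * r_jk

# IOU (reverse) edges, stored as (capacity, exchange rate).
iou_k_to_j = (x * r_jk, 1 / r_jk)
iou_j_to_i = (x * r_ij * r_jk, 1 / r_ij)

# Later, j converts x * r_jk j-points through the IOU edge from k to j.
k_points_returned = iou_k_to_j[0] * iou_k_to_j[1]
print(i_points_received, c_jk_after, c_ij_after, k_points_returned)
```

With these assumed values, settling the IOU returns exactly the $x$ $k$-points that were originally converted.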


Our Contributions: We formally establish two essential robustness properties of the proposed

framework:

1. No-arbitrage: Allowing dynamic settlement of IOUs by balancing existing IOUs with future

transactions never introduces any arbitrage in the system.

2. Non-concentration of market power: The state of the system is independent of the paths chosen for the transactions as long as they are routed along the paths with maximum exchange rate. This implies that no node is incentivized to demand money to act as an intermediary, and thus does not acquire sufficient ‘market-power’ even if it facilitates many transactions.

These two properties are essential to establish our model as a viable alternative to the credit

settlement process in coalition loyalty programs. The rest of the chapter is structured as follows. We first discuss some related work. Then, in Section 3.1, we formally introduce our model. In Section 3.2 we show our theoretical results on no-arbitrage and non-concentration of market power. Finally, in Section 3.3 we conclude with some future work and open problems.

Related Work: Credit networks were originally introduced in three parallel papers for modeling

trust between entities (DeFigueiredo and Barr [2005], Ghosh et al. [2007], Karlan et al. [2009]).

Credit networks have some immediate structural advantages: they are secure against malicious users,

they allow a decentralized formation leading to an organic growth of the system, and they allow

interactions between entities not directly related to each other. Dandekar et al. [2011] introduced a

decentralized payment infrastructure as an extension of credit networks and in a later paper studied

strategic formation of such networks (Dandekar et al. [2015]). Their model allowed nodes to print

their own currency and issue trust to each other to allow transactions along paths of sufficient trust.

They studied the problem of liquidity in such networks – showing that for various dense graph

topologies, the steady state probability of transactions failing under symmetric transaction regimes

was low. Our model is a direct extension to theirs, but with the addition of exchange rates across

every trust relationship issued between entities.

3.1 Model

We represent merchants by nodes in a directed multigraph $G = (V, E)$. For every edge $e = \langle i, j, r \rangle$ in $E$, let $c_e$ represent the credit capacity along the edge $e$: $i$ promises to accept up to $c_e$ points from $j$ at the pre-specified exchange rate of $r$ along this edge. We represent $x$ points from a merchant $j$ by the quantity $x_j$. A transaction $t$ is denoted by the tuple $\langle i, j, x \rangle$, where $i$ sends $x_i$ to $j$. These transactions can happen along paths in the network. We refer to a transaction $t = \langle i, j, x \rangle$ along an edge $e = \langle j, i, r \rangle$ as an edge-transaction $\langle t, e \rangle$, which is feasible only if $c_e$ is at least as large as $x$. If $\langle t, e \rangle$ occurs feasibly, $i$ gets in return $xr$ $j$-points, and the credit capacity on the edge $e$ decreases by $x$. At a future time, $j$ can request a reverse favor, i.e., exchange $xr$ $j$-points to get back these $x$ $i$-points. This reverse favor is represented by the reverse-edge $\bar{e} = \langle i, j, 1/r \rangle$ with credit capacity $xr$. We use the notation $\bar{e}$ to represent the reverse-edge of $e$. Note that the exchange rates on the reverse-edges are not visible to customers, but are only there to settle IOUs among the merchants.
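The edge-transaction and its reverse favor can be sketched as follows. This is an illustrative sketch only: the dictionary layout mapping `(tail, head, rate)` to a capacity and the function name `route_edge_transaction` are assumptions, not notation from the model, and rates are assumed to be exact `Fraction` values.

```python
from fractions import Fraction

def route_edge_transaction(capacity, edge, x):
    """Route x points along `edge` and record the IOU on its reverse-edge.

    capacity: dict mapping (tail, head, rate) -> credit capacity (assumed layout).
    edge: (j, i, r); merchant i sends x i-points and receives x * r j-points.
    Returns the number of j-points received by i.
    """
    j, i, r = edge
    if x <= 0 or capacity.get(edge, 0) < x:
        raise ValueError("edge-transaction is infeasible")
    capacity[edge] -= x
    reverse = (i, j, Fraction(1) / r)        # reverse-edge with reciprocal rate
    capacity[reverse] = capacity.get(reverse, 0) + x * r
    return x * r
```

Routing the reverse favor through the same function restores the original capacity, because the reverse of the reverse-edge is the original edge.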

Transactions change the credit capacities along edges, and hence change the state of the network. A state is simply the set of credit capacities along all edges in the network, and we denote the credit capacity on an edge $e$ in state $S$ as $c_e(S)$. The network $G$ can be viewed as the initial state, which we assume to have no arbitrage. Given a state $S$ of the credit network, a transaction $t = \langle u_k, u_1, x \rangle$ from merchant $u_k$ to merchant $u_1$ is feasible along a path $P = \{\langle u_1, u_2, r_1 \rangle, \langle u_2, u_3, r_2 \rangle, \ldots, \langle u_{k-1}, u_k, r_{k-1} \rangle\}$ from $u_1$ to $u_k$, if each edge in $P$ has adequate credit capacity in terms of $u$-points, i.e., for all $e_j = \langle u_j, u_{j+1}, r_j \rangle$ in $P$, $c_{e_j}(S) \ge x_{u_k} = \left( x \cdot \prod_{i=j+1}^{k-1} r_i \right)_{u_{j+1}}$.

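Routing a feasible transaction $t$ along path $P$ in state $S$ results in a state $S'$ given by

$$c_e(S') = \begin{cases} c_e(S) - x \cdot \prod_{i=j+1}^{k-1} r_i, & \text{if } e = e_j \in P \\ c_e(S) + x \cdot \prod_{i=j}^{k-1} r_i, & \text{if } e = \bar{e}_j \in \bar{P} \\ c_e(S), & \text{otherwise} \end{cases} \quad (3.1)$$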
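The update in Eq. (3.1) can be sketched in a few lines. The state layout (a dictionary from `(tail, head, rate)` to capacity) and the function name are assumptions made for illustration; rates are assumed to be exact `Fraction` values so that reverse-edge keys match up.

```python
from fractions import Fraction

def route_path_transaction(capacity, path, x):
    """Apply Eq. (3.1) for a transaction of x u_k-points along `path`.

    capacity: dict mapping (tail, head, rate) -> credit capacity (assumed layout).
    path: list of edges [(u_1, u_2, r_1), ..., (u_{k-1}, u_k, r_{k-1})].
    Returns a new capacity dict; raises ValueError if some edge lacks capacity.
    """
    new_capacity = dict(capacity)
    amount = x                      # x * prod_{i=j+1}^{k-1} r_i, starting at j = k-1
    for (tail, head, rate) in reversed(path):
        if new_capacity.get((tail, head, rate), 0) < amount:
            raise ValueError("transaction is infeasible along this path")
        new_capacity[(tail, head, rate)] -= amount
        reverse = (head, tail, Fraction(1) / rate)
        amount = amount * rate      # now x * prod_{i=j}^{k-1} r_i
        new_capacity[reverse] = new_capacity.get(reverse, 0) + amount
    return new_capacity
```

Because the function works on a copy, an infeasible transaction leaves the input state untouched.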

Note that, since merchants unilaterally decide to extend credit to each other, there may already exist an edge $e'_j = \langle u_{j+1}, u_j, r'_j \rangle$ from $u_{j+1}$ to $u_j$ for all $1 \le j \le k-1$. Therefore, a credit network may have up to two edges from some merchant $u$ to another merchant $v$. Routing a feasible transaction only affects the total credit extended to the payer (merchant $u_k$) and the payee (merchant $u_1$); the total credit extended to all other merchants remains unchanged. Also note that for each edge $e$ and any two states $S$ and $S'$ of the network,

$$c_e(S) + c_{\bar{e}}(S)/r_e = c_e(S') + c_{\bar{e}}(S')/r_e \quad (3.2)$$

For a transaction $t = \langle u_k, u_1, x \rangle$ and a path $P = \{\langle u_1, u_2, r_1 \rangle, \langle u_2, u_3, r_2 \rangle, \ldots, \langle u_{k-1}, u_k, r_{k-1} \rangle\}$, we refer to the tuple $\langle t, P \rangle$ as a path-transaction. Observe that routing a path-transaction $\langle t, P \rangle$ is equivalent to routing a sequence of $(k-1)$ edge-transactions, $\langle t_j, e_j \rangle$, $1 \le j \le k-1$, where $t_j := \langle u_{j+1}, u_j, x \cdot \prod_{i=j+1}^{k-1} r_i \rangle$, and $e_j = \langle u_j, u_{j+1}, r_j \rangle$.

For a path $P$ from a merchant $u$ to a merchant $v$ in the state $S$, we define $c_P(S) := \sup\{x : x > 0,\ t = \langle v, u, x \rangle \text{ is feasible along } P\}$ as the capacity of path $P$ in state $S$. $c_P(S)$ is the maximum payment in $v$-points that can be routed along $P$ in $S$. We use $\mathcal{P}_{uv}(S)$ to denote the set of paths from $u$ to $v$ in state $S$ and $\bar{P}$ to denote the reverse path of path $P$. We define the exchange rate along $P$ as $r_P := \prod_{e \in P} r_e$, where $r_e$ is the exchange rate along edge $e$ in $P$.
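Both quantities are straightforward to compute. The sketch below assumes a state stored as a dictionary from `(tail, head, rate)` to capacity, with capacities and rates given as integers or `Fraction` values; the function names are illustrative. It uses the fact that the feasibility condition on edge $e_j$ bounds $x$ by $c_{e_j}(S) / \prod_{i=j+1}^{k-1} r_i$, so the path capacity is the minimum of these bounds.

```python
from fractions import Fraction
from math import prod

def path_exchange_rate(path):
    """r_P: product of the exchange rates of the edges (tail, head, rate) on the path."""
    return prod(rate for (_, _, rate) in path)

def path_capacity(capacity, path):
    """c_P(S): largest payment, in points of the last node, routable along the path.

    Returns None for an empty path.
    """
    best = None
    downstream = Fraction(1)        # prod_{i=j+1}^{k-1} r_i for the current edge e_j
    for edge in reversed(path):
        limit = Fraction(capacity.get(edge, 0)) / downstream
        best = limit if best is None else min(best, limit)
        downstream *= edge[2]
    return best
```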

We route transactions along maximum exchange rate paths. More specifically, a transaction $t = \langle v, u, x \rangle$ in state $S$ is routed according to the following recursive procedure: Consider a maximum exchange rate path $P^*$ in $\mathcal{P}_{uv}(S)$. If $x \le c_{P^*}(S)$, we route the path transaction $\langle t, P^* \rangle$ and we are done. Otherwise, we route $t^* := \langle v, u, c_{P^*}(S) \rangle$ along the path $P^*$ resulting in state $S'$. Let $x' := x - c_{P^*}(S)$. We try to route the residual transaction $t' := \langle v, u, x' \rangle$ in state $S'$ recursively. If any of the successive residual transactions fail, we roll back all transactions to restore state $S$.
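The routing procedure can be sketched as a loop over residual transactions. This is a minimal illustration under several assumptions that are not part of the model: the state is a dictionary from `(tail, head, rate)` to capacity with `Fraction` rates, maximum exchange rate paths are found by brute-force enumeration of simple paths (adequate only for small networks), and rollback is obtained by working on a copy of the state.

```python
from fractions import Fraction

def simple_paths(state, u, v, seen=None):
    """Yield simple paths (lists of edges) from u to v over positive-capacity edges."""
    seen = (seen or set()) | {u}
    for edge, cap in state.items():
        tail, head, _ = edge
        if tail != u or cap <= 0 or head in seen:
            continue
        if head == v:
            yield [edge]
        else:
            for rest in simple_paths(state, head, v, seen):
                yield [edge] + rest

def rate(path):
    """Exchange rate r_P of a path."""
    result = Fraction(1)
    for (_, _, r) in path:
        result *= r
    return result

def path_capacity(state, path):
    """Capacity c_P(S) of a non-empty path, in points of its last node."""
    limit, downstream = None, Fraction(1)
    for edge in reversed(path):
        here = state[edge] / downstream
        limit = here if limit is None else min(limit, here)
        downstream *= edge[2]
    return limit

def apply_path(state, path, x):
    """Update capacities in place for x points routed along the path."""
    amount = x
    for (tail, head, r) in reversed(path):
        state[(tail, head, r)] -= amount
        amount *= r
        reverse = (head, tail, Fraction(1) / r)
        state[reverse] = state.get(reverse, 0) + amount

def route(state, v, u, x):
    """Route t = <v, u, x>; return (new_state, points received by v) or None on failure."""
    work, received = dict(state), Fraction(0)   # copy, so failure leaves `state` intact
    while x > 0:
        paths = list(simple_paths(work, u, v))
        if not paths:
            return None                          # rollback: the caller keeps `state`
        best = max(paths, key=rate)
        sent = min(x, path_capacity(work, best))
        apply_path(work, best, sent)
        received += sent * rate(best)
        x -= sent
    return work, received
```

The sketch makes no attempt at efficiency; a shortest-path search on $-\log r_e$ weights would be the natural replacement for the path enumeration.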

3.2 Results

Theorem 3.2.1. (No Arbitrage): Starting from a no-arbitrage state S, all states reachable through

feasible transactions are arbitrage-free.

Proof. We prove this by contradiction. Assume that, at some point in time, a transaction creates an arbitrage, i.e., a cycle $C$ along which the product of exchange rates is greater than 1. Let this transaction route along a path $P$. Since this transaction creates the arbitrage cycle $C$, it must interact with the cycle $C$, i.e., $C \cap P \ne \emptyset$. Consider the last edge $e = \langle u, v, r \rangle \in C$ that gets changed due to this transaction. Then $C$ would look as shown in Figure 3.2(a). That is, there are two paths from node $u$ to $v$: one is the edge $e$ and the other has overall exchange rate $R$. The existence of this path is guaranteed by the formation of the directed cycle $C$ after the transaction is routed. Since the transaction chooses the edge $e$ instead of the other path from $u$ to $v$, we get $r \ge R$. Thus after the transaction is routed, the reverse-edge $\bar{e}$ is created, if it did not exist previously, and we get the directed cycle $C$ as shown in Figure 3.2(b). The overall exchange rate along this cycle $C$ is $R \cdot \frac{1}{r} = \frac{R}{r}$. Now since this is an arbitrage cycle, we get $\frac{R}{r} > 1 \implies R > r$, which is a contradiction. $\square$

Figure 3.2: Transaction from u to v Leading to Cycle C

Corollary 3.2.1. Any state reachable from a no-arbitrage state S through feasible transactions can

have at most three edges between any two merchants u and v; two from u to v, and 1 from v to u,

or vice versa.

Proof. We prove this by contradiction. Assume there exist 4 (the maximum possible) exchange rates between $u$ and $v$. Then the possible exchange rates would be $r_1, \frac{1}{r_2}$ from $u$ to $v$, and $r_2, \frac{1}{r_1}$ from $v$ to $u$ ($r_1 \ne r_2$, $r_1 \ne \frac{1}{r_2}$). Clearly either $r_1 \cdot r_2$ or $\frac{1}{r_1 \cdot r_2}$ is greater than 1, contradicting Theorem 3.2.1. $\square$


Definition 1. We call a cycle $C$ ‘loss free’ if the product of exchange rates along $C$ is 1. We say a state $S'$ is reachable via loss free cycles from a state $S$ if there exist loss free cycles with appropriate transaction amounts that lead to the transition, and we denote this as $S \overset{lC}{\Longrightarrow} S'$.

We define state equivalence under this notion of reachability through loss free cycles and represent equivalence of two states $S$ and $S'$ as $S \overset{E}{=} S'$. Thus reachability via loss free cycles partitions the set of states into equivalence classes. It is easy to see that if any state in an equivalence class is arbitrage-free then all states in that equivalence class are arbitrage-free, as all the states in the class are reachable from each other via loss free cycles. We call such an equivalence class an arbitrage-free equivalence class. We use this as a notion of equivalence and show Theorem 3.2.3 on path-independence. We set up some more notation first.

Definition 2. For any node $u$ in state $S$ we denote $N^{OUT}_u(S)$ as the out-neighbors of $u$, $N^{IN}_u(S)$ as the in-neighbors of $u$, $E^{OUT}_u(S)$ as the out-edges of $u$, and $E^{IN}_u(S)$ as the in-edges of $u$.

Definition 3. We establish the following notation for a node $u$ in state $S$:

$d^{OUT}_u(S) = \sum_{e \in E^{OUT}_u(S)} c_e \cdot r_e$: total amount of credit $u$ vests on its neighbors in $u$'s currency.

$d^{IN}_u(S) = \sum_{e \in E^{IN}_u(S)} c_e$: total amount of credit neighbors of $u$ vest on it in $u$'s currency.

$d^{OUT}(S)$ and $d^{IN}(S)$ denote the vectors of $d^{OUT}_u(S)$ and $d^{IN}_u(S)$ for all nodes $u$ in the graph and are equivalent definitions for the Generalized Score Vector as in Dandekar et al. [2011]. We refer to both $d^{OUT}(S)$ and $d^{IN}(S)$ as $d(S)$ or d-vectors for a state $S$. The following observations (Obs. 1 to 4) follow easily.

Observation 1. A $\langle t, P \rangle$ path transaction does not change the d-vectors of any nodes in the graph except the source and sink of $P$.

Observation 2. A $\langle t, C \rangle$ transaction along a loss free cycle $C$ does not change the d-vectors of any nodes in the graph.

Observation 3. If a state $S$ transitions to $S'$ via a transaction $t = \langle v, u, x \rangle$ along an edge $e = \langle u, v, r \rangle$, then $\bar{e} = \langle v, u, 1/r \rangle$, $c_{\bar{e}}(S') \ge xr$, and $1/r = \max\{r_P : P \in \mathcal{P}_{vu}(S')\}$.

Observation 4. If a state $S'$ is reachable from a state $S$ via a transaction $t$ along a loss free cycle $C$, then $S$ is reachable from $S'$ via the transaction $t$ along the reverse loss free cycle $\bar{C}$.
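The d-vectors of Definition 3 can be computed directly from a state, which also gives a quick way to check Observation 1 on small examples. The sketch below assumes a state stored as a dictionary from `(tail, head, rate)` to capacity; the function name `d_vectors` is illustrative.

```python
from fractions import Fraction

def d_vectors(capacity):
    """Return (d_out, d_in) dicts for a state given as {(tail, head, rate): capacity}."""
    d_out, d_in = {}, {}
    for (tail, head, rate), cap in capacity.items():
        d_out[tail] = d_out.get(tail, 0) + cap * rate   # credit in the tail's currency
        d_in[head] = d_in.get(head, 0) + cap            # credit in the head's currency
    return d_out, d_in

# Hypothetical three-merchant state before and after routing 10 k-points
# along the path i -> j -> k with rates 1/2 and 4/5.
before = {("i", "j", Fraction(1, 2)): Fraction(20),
          ("j", "k", Fraction(4, 5)): Fraction(30)}
after = {("i", "j", Fraction(1, 2)): Fraction(12),
         ("j", "k", Fraction(4, 5)): Fraction(20),
         ("k", "j", Fraction(5, 4)): Fraction(8),
         ("j", "i", Fraction(2)): Fraction(4)}
out_before, in_before = d_vectors(before)
out_after, in_after = d_vectors(after)
```

In this assumed example the intermediate merchant `j` has the same out and in totals in both states, while the entries for the endpoints `i` and `k` change.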

Next we show the equivalence theorem to characterize a simpler condition for state equivalence.

This is a technical theorem and helps us with the proof of Theorem 3.2.3, not offering any direct

insights. We leave the proof to the appendix.

Theorem 3.2.2. For any two arbitrage-free states $S$ and $S'$ of the network the following statements are equivalent:

1. $S$ and $S'$ belong to the same equivalence class.

2. $S'$ is reachable from $S$ via a combination of loss free cycles.

3. The d-vectors are the same at each node in $S$ and $S'$.

Theorem 3.2.3. (Path Independence): Starting from any state in an arbitrage free equivalence

class, irrespective of which paths are chosen for a set of successive transactions, as long as the paths

are exchange rate maximizing, the final equivalence class reached is the same.

Proof. We first show Lemma 1, that transactions induce transitions across equivalence classes: irrespective of the choice of state in a fixed equivalence class $E_1$, any feasible transaction always transitions the state into the same equivalence class $E_2$. Now we use a simple inductive argument over Lemma 1 to prove the theorem. Let some set of $k$ transactions in a state $S$ belonging to an equivalence class $E$ be feasible along two sets of maximum exchange rate paths $\{P_1, \ldots, P_k\}$ and $\{P'_1, \ldots, P'_k\}$. Let the intermediate equivalence classes reached in the two cases be $\{E_1, \ldots, E_k\}$ and $\{E'_1, \ldots, E'_k\}$ respectively. Applying Lemma 1 once shows $E_1 = E'_1$. Applying it recursively we get $E_k = E'_k$.

Lemma 1. Fix two arbitrage-free states $S$ and $S'$ of the same network which belong to the same equivalence class. Let $t = \langle v, u, x \rangle$ be a transaction. If $t$ transitions $S$ to $\bar{S}$ and $S'$ to $\bar{S}'$, then $\bar{S} \overset{E}{=} \bar{S}'$.

Proof. Let $P_1 = \arg\max\{r_P : P \in \mathcal{P}_{uv}(S)\}$ and $P_2 = \arg\max\{r_P : P \in \mathcal{P}_{uv}(S')\}$. We first show that $r_{P_1} = r_{P_2}$. Since $S \overset{E}{=} S'$ we get from Theorem 3.2.2 that $S \overset{lC}{\Longrightarrow} S'$. We show the following proposition.

Proposition 2. If $S'$ is reachable from $S$ via a single transaction along a loss free cycle $C$ then $r_{P_1} = r_{P_2}$.

Proof. We prove this by contradiction. Assume $r_{P_1} \ne r_{P_2}$. Assume w.l.o.g. $r_{P_1} > r_{P_2}$. First observe that $C \cap P_1 \ne \emptyset$, as otherwise $c_{P_1}(S') > 0$, which contradicts the maximality of exchange rate along $P_2$ in $S'$. Let $u_2$ be the first node of $P_1$ along $C$ and $v_2$ be the last node of $P_1$ along $C$. Let $P$ be the sub-path of $P_1$ from $u_2$ to $v_2$. Let $P'$ be the sub-path of $C$ from $u_2$ to $v_2$. Then $r_P \ge r_{P'}$ as $P_1$ is the maximum exchange rate path in $S$. Let $P''$ be the sub-path of $C$ from $v_2$ to $u_2$. Then

$$1 = r_C = r_{P'} \cdot r_{P''} \le r_P \cdot r_{P''} \quad (3.3)$$

But $P + P''$ is also a cycle, and the no-arbitrage theorem (Theorem 3.2.1) immediately gives $r_P \cdot r_{P''} \le 1$. This implies that in Eq. 3.3 equality holds throughout. Thus $1 = r_P \cdot r_{P''} = r_{P'} \cdot r_{P''}$, which implies $r_P = r_{P'} = \frac{1}{r_{P''}}$. In $S'$, the reverse loss free cycle $\bar{C}$ exists as $S'$ is reachable from $S$ via $C$ (Obs. 4). Thus $\bar{P}''$, i.e., the reverse path of $P''$, is a path in $S'$ from $u_2$ to $v_2$ with exchange rate $r = \frac{1}{r_{P''}} = r_P$. Now it is easy to observe that $P_3 = P_1 \setminus P + \bar{P}'' \in \mathcal{P}_{u_1 v_1}(S')$ and $r_{P_3} = \frac{r_{P_1}}{r_P} \cdot r = r_{P_1}$.

Since $r_{P_1} > r_{P_2}$ we get $r_{P_3} > r_{P_2}$, which contradicts the maximality of exchange rate along $P_2$ in $S'$. $\square$

Now let $S'$ be reachable from $S$ via single transactions along $k$ loss free cycles $\{C_1, \ldots, C_k\}$ with intermediary states $\{S_1, \ldots, S_{k-1}\}$. Let $P'_i = \arg\max\{r_P : P \in \mathcal{P}_{uv}(S_i)\}$. Then Prop. 2 implies $r_{P_1} = r_{P'_1} = r_{P'_2} = \ldots = r_{P'_{k-1}} = r_{P_2}$. Hence we have shown $r_{P_1} = r_{P_2}$. Now we use this fact to show the following claim.

Claim 3. If $x \le \min\{c_{P_1}(S), c_{P_2}(S')\}$, then $\bar{S} \overset{E}{=} \bar{S}'$.

Proof. In $S$, $P_1$ is a valid choice of path for the transaction $t$, and since $c_{P_1}(S) \ge x$, it is also sufficient. Let $S_1$ be the state reached by transacting $t$ along $P_1$ in $S$. If $t$ goes through any other choice of paths, all of them would have the same exchange rate as $P_1$ due to maximality of exchange rate along $P_1$ in $S$. Thus, in $\bar{S}$, loss free cycles will be formed for every path other than $P_1$ used. Then it is easy to see that $\bar{S} \overset{lC}{\Longrightarrow} S_1$, which implies $\bar{S} \overset{E}{=} S_1$. Similarly, $\bar{S}'$ is equivalent to a state $S_2$ reached by a $\langle t, P_2 \rangle$ transaction in $S'$. We will show $d(S_1) = d(S_2)$, which would imply $\bar{S} \overset{E}{=} \bar{S}'$ by Theorem 3.2.2 and transitivity of equivalence. First, since the d-vectors are not changed for intermediate nodes between the source and sink (Obs. 1), we get for all nodes $w$ except $u$ and $v$, $d_w(S_1) = d_w(S_2)$. Now $d^{IN}_v(S_1) = d^{IN}_v(S) - x = d^{IN}_v(S') - x = d^{IN}_v(S_2)$. And $d^{OUT}_v(S_1) = d^{OUT}_v(S) + x = d^{OUT}_v(S') + x = d^{OUT}_v(S_2)$. Thus $d_v(S_1) = d_v(S_2)$. And since $r_{P_1} = r_{P_2}$, we get $d^{IN}_u(S_1) = d^{IN}_u(S) + x \cdot r_{P_1} = d^{IN}_u(S') + x \cdot r_{P_2} = d^{IN}_u(S_2)$. And $d^{OUT}_u(S_1) = d^{OUT}_u(S) - x \cdot r_{P_1} = d^{OUT}_u(S') - x \cdot r_{P_2} = d^{OUT}_u(S_2)$. Thus $d(S_1) = d(S_2)$. $\square$

If $x > \min\{c_{P_1}(S), c_{P_2}(S')\}$, let $x_1 = \min\{c_{P_1}(S), c_{P_2}(S')\}$. Let $t'_1 = \langle v, u, x_1 \rangle$. Let routing $t'_1$ in $S$ and $S'$ bring the network to states $S_1$ and $S'_1$ respectively. Then the above claim shows $S_1 \overset{E}{=} S'_1$. Let $t_1 = \langle v, u, x - x_1 \rangle$. Thus we have strictly reduced the transaction amount ($x$) by routing a smaller transaction $t'_1$, and the states reached are equivalent to each other. Thus we end with the same starting condition of state equivalence, and we can continue this procedure. Hence a simple induction shows that $t$ will be completely routed in a finite number of steps according to the above procedure and $\bar{S} \overset{E}{=} \bar{S}'$.

3.3 Conclusions

We extend the model introduced by Dandekar et al. [2011] to allow arbitrary exchange rates between entities and use it to propose an alternative system for settling credits between members of a coalition loyalty program. Our system is decentralized, does not need any centralized currency for settlements, and is thereby easy to grow and distribute. The system allows transactions between merchants who may not have a direct partnership. Additionally, it is secure against malicious merchants, as a malicious merchant can create risks only for its direct partners. We showed two essential properties that make our model a viable alternative: first, the system is secure against the creation of arbitrage opportunities, and second, intermediaries to transactions do not have incentives to demand payments for transactions through them, as long as alternate paths with the same exchange rates exist. A critical question here is whether the IOUs in credit networks can always settle via future transactions. Does the system ensure sufficient liquidity for transactions under some model of customer transactions? Dandekar et al. [2011] showed liquidity under a simple transaction regime for the setting with unit exchange rates between nodes. Extending it to arbitrary exchange rates can provide useful insights on how customer transactions influence the need for renegotiation of exchange rates within coalition loyalty programs, and on what exchange rates, depending on the frequency of customer transactions, lead to long-term liquidity and stability of the network.

3.4 Appendix

3.4.1 Proof of Theorem 3.2.2

We show that for any two states $S$ and $S'$ the following holds:

$$S \overset{lC}{\Longrightarrow} S' \iff d(S) = d(S')$$

One side of this relation is straightforward: $S \overset{lC}{\Longrightarrow} S' \implies d(S) = d(S')$, as transactions along loss free cycles do not change the d-vectors (Obs. 2). To prove the reverse side we first introduce some more notation and show two lemmas, Lemma 2 and Lemma 3. Then the result follows easily. We will use Eq. 3.2 a number of times in the proof, so we restate it here.

$$c_e(S) + c_{\bar{e}}(S)/r_e = c_e(S') + c_{\bar{e}}(S')/r_e \quad (3.4)$$

Definition 4. For two states $S$ and $S'$ of the network, we denote $G'(S, S')$ as the difference graph of $S$ and $S'$ with edge capacities defined as follows:

$$c_e(G'(S, S')) = \begin{cases} c_e(S) - c_e(S'), & \text{if } c_e(S) - c_e(S') > 0 \\ 0, & \text{otherwise} \end{cases}$$
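As an illustration of Definition 4, the difference graph can be computed edge by edge. The sketch assumes both states are dictionaries from `(tail, head, rate)` to capacity, with missing edges treated as capacity 0; the function name and the example numbers are hypothetical.

```python
from fractions import Fraction

def difference_graph(state, other):
    """Capacities of the difference graph: max(c_e(S) - c_e(S'), 0) for every edge
    that appears in either state (missing edges count as capacity 0)."""
    edges = set(state) | set(other)
    return {e: max(state.get(e, 0) - other.get(e, 0), 0) for e in edges}

# Hypothetical states: S is the initial network, S_prime the state after
# routing 10 k-points along i -> j -> k with rates 1/2 and 4/5.
S = {("i", "j", Fraction(1, 2)): Fraction(20),
     ("j", "k", Fraction(4, 5)): Fraction(30)}
S_prime = {("i", "j", Fraction(1, 2)): Fraction(12),
           ("j", "k", Fraction(4, 5)): Fraction(20),
           ("k", "j", Fraction(5, 4)): Fraction(8),
           ("j", "i", Fraction(2)): Fraction(4)}
G = difference_graph(S, S_prime)
```

Only the edges whose capacity dropped between the two states carry positive capacity in the result; the newly created reverse-edges have capacity 0.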

Let $S'$ be reached from $S$ via $n$ edge-transactions $\{t_1, \ldots, t_n\}$ transitioning through intermediate states $\{S_1, \ldots, S_{n-1}\}$. Let $G'(S_i) = G'(S, S_i)$ for all $1 \le i \le n-1$ and $G'(S') = G'(S, S')$. The following observations follow easily from the definition.

Observation 5. Let $S$ and $S'$ be two states of the same network. For any edge $e$, if $c_e(G'(S, S')) > 0$ then $c_{\bar{e}}(S') > 0$.

Observation 6. If $t_i$ ($i \le n$) is the last transaction routed along an edge $e$, then $c_e(G'(S_i)) \ge c_e(G'(S'))$.

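Lemma 2. $d^{OUT}_u(S') = d^{OUT}_u(S) + d^{IN}_u(G'(S')) - d^{OUT}_u(G'(S'))$ for all $u \in G'(S')$.

Proof. Using Eq. 3.4 and the definition of $G'$ it is easy to verify the following:

$$c_e(S') = \begin{cases} c_e(S) - c_e(G'(S')), & \text{if } c_e(G'(S')) > 0 \\ c_e(S) + c_{\bar{e}}(G'(S'))/r_e, & \text{otherwise} \end{cases}$$

Thus

$$\begin{aligned}
d^{OUT}_u(S') &= \sum_{e \in E^{OUT}_u(S)} c_e(S') \times r_e \\
&= \sum_{\substack{e \in E^{OUT}_u(S); \\ c_e(G'(S')) > 0}} \left( c_e(S) - c_e(G'(S')) \right) \times r_e + \sum_{\substack{e \in E^{OUT}_u(S); \\ c_e(G'(S')) = 0}} \left( c_e(S) + c_{\bar{e}}(G'(S'))/r_e \right) \times r_e \\
&= \sum_{e \in E^{OUT}_u(S)} c_e(S) \times r_e + \sum_{\substack{e \in E^{OUT}_u(S); \\ c_e(G'(S')) = 0}} c_{\bar{e}}(G'(S')) - \sum_{\substack{e \in E^{OUT}_u(S); \\ c_e(G'(S')) > 0}} c_e(G'(S')) \times r_e \\
&= \sum_{e \in E^{OUT}_u(S)} c_e(S) \times r_e + \sum_{e \in E^{IN}_u(G'(S'))} c_e(G'(S')) - \sum_{e \in E^{OUT}_u(G'(S'))} c_e(G'(S')) \times r_e \\
&= d^{OUT}_u(S) + d^{IN}_u(G'(S')) - d^{OUT}_u(G'(S'))
\end{aligned}$$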

Lemma 3. For any pair of nodes $u, v$ there cannot exist 2 edges from $u$ to $v$ and 1 path from $v$ to $u$ simultaneously in $G'(S')$ with non zero credit capacity.

Proof. Assume the contrary, i.e., there exist distinct edges $e_1, e_2$ from $u$ to $v$ and a path $P$ from $v$ to $u$ with $c_{e_1}(G'(S')) > 0$, $c_{e_2}(G'(S')) > 0$, $c_P(G'(S')) > 0$. Assume w.l.o.g. $r_{e_1} > r_{e_2}$. Since $c_{e_1}(G'(S'))$, $c_{e_2}(G'(S'))$, and $c_P(G'(S')) > 0$, some transactions must be routed along $e_1$, $e_2$, and along each edge of $P$. Let $t_i$ ($i \le n$) be the last edge-transaction along $e_1$, $e_2$, or some edge on $P$, and the state where this transaction takes place is $S_{i-1}$. We first show the following proposition:

Proposition 3. Let $P, P' \in \mathcal{P}_{uv}(S')$ be two edge disjoint paths and $r_P > r_{P'}$. Then $c_{P'}(G'(S')) > 0$ implies $c_P(S') = 0$.

Proof. We prove this by contradiction. Assume $c_P(S') > 0$. Now $c_{P'}(G'(S')) > 0$ implies that some of the transactions must have routed along edges in $P'$. Let $t_i$ be the last transaction routed along $P'$ and let $t_i$ route via some edge $e_1 \in P'$. Then $c_{e'}(G'(S_{i-1})) \ge c_{e'}(G'(S')) > 0$ for all $e' \in P' \setminus \{e_1\}$ (Obs. 6). Obs. 5 implies $c_{\bar{e}'}(S_{i-1}) > 0$ for all $e' \in P' \setminus \{e_1\}$. Since the $t_i$ transaction routes via edge $e_1 \in P'$, $e_1$ is the cheapest path for $t_i$. Now we state and prove a helpful claim.


Claim 4. Let $S$ be a state, and $P, P' \in \mathcal{P}_{u_1 v_1}(S)$ be two edge disjoint paths with $r_P > r_{P'}$. Let $e = \langle u_2, v_2, r \rangle \in P'$ be some edge and $t = \langle v_2, u_2, x \rangle$ be some transaction. If $e$ is the cheapest path for $t$ and $c_{\bar{e}'}(S) > 0$ for all $e' \in P' \setminus \{e\}$, then $c_P(S) = 0$.

Proof. We prove this by contradiction. Assume $c_P(S) > 0$. Let $\bar{P}_1$ be the sub-path of $\bar{P}'$ from $u_2$ to $u_1$, and $\bar{P}_2$ be the sub-path of $\bar{P}'$ from $v_1$ to $v_2$. Since $c_{\bar{e}'}(S) > 0$ for all $e' \in P' \setminus \{e\}$, we have $c_{\bar{P}_1}(S) > 0$ and $c_{\bar{P}_2}(S) > 0$. Thus there exists a path from $u_2$ to $v_2$ along $\bar{P}_1$ followed by $P$ followed by $\bar{P}_2$. The exchange rate along this path is $r_{\bar{P}_1} \times r_P \times r_{\bar{P}_2} = \frac{r_P \times r_e}{r_{P'}} > r_e$. Thus this path is cheaper than $e$ for $t$. Hence, we have a contradiction.

Continuing onto the proof of our proposition, the above claim implies $c_P(S_{i-1}) = 0$. Now since $c_P(S') > 0$, there must be a transaction after $t_i$ that routes along some edge on $P$. Let $t_j$ be the last such transaction, and let it route along an edge $e_2 \in P$. Thus we get $c_{e'}(G'(S_{j-1})) \ge c_{e'}(G'(S')) > 0$ for all $e' \in P \setminus \{e_2\}$ (Obs. 6). Obs. 5 implies $c_{\bar{e}'}(S_{j-1}) > 0$ for all $e' \in P \setminus \{e_2\}$. Since the $t_j$ transaction routes via edge $e_2 \in P$, the above claim implies $c_{P'}(S_j) = 0$.

Since $c_{P'}(S') > 0$, there must be a transaction after $t_j$ that routes along some edge on $P'$. But $t_i$ was assumed to be the last transaction along any edge on $P'$, and $t_j$ occurred after $t_i$; hence we have a contradiction.

Continuing onto the proof of the lemma, we consider the following three cases:

1. $t_i$ is along $e_1$ implies $c_{e_1}(S_{i-1}) > 0$.
$c_{e_2}(G') > 0 \implies c_{e_2}(G'(S_{i-1})) > 0$ (Obs. 6).
Since $r_{e_1} > r_{e_2}$ and both $e_1$ and $e_2$ are from $u$ to $v$, we get a contradiction from Prop. 3.

2. $t_i$ is along $e_2$ implies $c_{e_2}(S_{i-1}) > 0$.
$c_P(G') > 0 \implies c_P(G'(S_{i-1})) > 0$ (Obs. 6).
$c_P(G'(S_{i-1})) > 0 \implies c_{\bar{P}}(S_{i-1}) > 0$ (Obs. 5).
Since $e_2$ is used for the $i$-th transaction, $r_{e_2} > r_{\bar{P}}$. Again this contradicts Prop. 3.

3. $t_i$ is routed along an edge $e' = \langle u', v', r' \rangle \in P$ implies $c_{e'}(S_{i-1}) > 0$. For any edge $e = e_2$ or $e \in P \setminus \{e'\}$ we get
$c_e(G') > 0 \implies c_e(G'(S_{i-1})) > 0$ (Obs. 6) $\implies c_{\bar{e}}(S_{i-1}) > 0$ (Obs. 5).
Let $\tilde{P} = \overline{P \setminus \{e'\}} + \bar{e}_2$. $\tilde{P}$ is a path from $u'$ to $v'$ and $c_{\tilde{P}}(S_{i-1}) > 0$ (from the above observation). Since $e'$ is used for the $i$-th transaction we get $r(e') > r(\tilde{P})$. And since $c_{e'}(G'(S_{i-1})) > 0$ and $c_{\tilde{P}}(S_{i-1}) > 0$, we again get a contradiction from Prop. 3.


Now we continue onto the proof of the main theorem. We show that $G'(S')$ can be decomposed into a combination of loss free cycles, which immediately implies $S \overset{lC}{\Longrightarrow} S'$ given that the d-vectors are the same in both $S$ and $S'$.

For the sake of notational convenience, we denote $G'(S')$ by $G'$. First observe that since $G'$ is the difference graph, $d^{OUT}_u(G') = d^{IN}_u(G')$ for all $u \in G'$. Lemma 3 clearly applies. We prove this by contradiction. Assume $G'$ cannot be decomposed into a combination of loss free cycles. Let $G''$ be the graph obtained after removing all possible loss free cycles from $G'$. Observe that removing loss free cycles preserves Lemma 2. Also, removal of loss free cycles leads to equal change in $d^{OUT}$ and $d^{IN}$; thus, $d^{OUT}_u(G'') = d^{IN}_u(G'')$ for all $u \in G''$. Observe that a source or a sink node cannot satisfy this property; hence, $G''$ cannot contain any source or sink node. Let $P$ be a maximal set of contiguous edges, each having non zero credit capacity along the same direction in $G''$. $P$ must be a cycle, since $G''$ cannot have any source or sink nodes. Also, Lemma 3 shows that there cannot be two edges between any two nodes $u, v$ on $P$, since, along the cycle, there already exists a path from $v$ to $u$. Now any node along $P$ does not have any incoming or outgoing edges except those along $P$, as otherwise it contradicts the maximality of $P$. Thus we get for each edge $e = \langle u, v, r_e \rangle \in P$, $d^{OUT}_u(G'') = c_e \times r_e$ and $d^{IN}_v(G'') = c_e$. Thus $d^{OUT}_u(G'')/r_e = d^{IN}_v(G'')$ for each $e = \langle u, v, r_e \rangle \in P$. Since $d^{OUT}_u(G'') = d^{IN}_u(G'')$ for all $u \in G''$, we get $d^{OUT}_u(G'')/r_e = d^{OUT}_v(G'')$ for each $e = \langle u, v, r_e \rangle \in P$. Taking the product over all edges in $P$ we get for any node $u \in P$, $d^{OUT}_u(G'') = d^{OUT}_u(G'') \cdot \prod_{e \in P} r_e \implies \prod_{e \in P} r_e = 1$, which means $P$ is loss free, which is a contradiction. This completes our proof of the equivalence theorem.

Chapter 4

Optimal Design of a Frequency

Reward Program

In this chapter we investigate the problem of designing a frequency reward program for a merchant against a traditional pricing merchant. There is extant literature on characterizing customer behavior toward frequency reward programs. Most of the literature is empirical in nature, and relies on psychological behavioral patterns among customers, as opposed to rational economic decision making. We consider a competitive duopoly of two merchants where one merchant offers a loyalty reward program and the other offers traditional pricing with discounts and characterize a novel model of customer choice where customers measure their utilities in rational economic terms. In addition, we characterize the optimal reward design choice for the merchant offering the frequency reward program, based on different customer populations: specifically, how should the merchant decide the optimal reward redemption thresholds and dollar value of rewards to optimize for its revenue share from the participating customer population.

The rest of the chapter is structured as follows. First we describe some past work. Then we go over our contributions and explain how our work builds on past literature. In Section 4.1 we describe our model, followed by the main results in Section 4.2. We follow up with a short discussion and future work in Section 4.3.

Related Work: Three popular psychological constructs have been used to explain customer choice dynamics toward reward programs: Goal Gradient Hypothesis, Medium Maximization, and Tipping Point Dynamics. Kivetz et al. [2006] conducted an empirical study observing an acceleration in the number of purchases by customers as they approached the reward, i.e., as customers accumulated reward points to reach closer to achieving the reward, their effort invested toward gaining more points increased. The authors attributed this behavior to Goal Gradient Hypothesis (Hull [1932]).


This behavior is also very prevalent in online badge systems, such as those on Stackoverflow; recently, mathematical models relying on rational user behavior have been developed that explain this phenomenon (Anderson et al. [2013]). Stourm et al. [2015], Dreze and Nunes [2004] observed that customers often stockpiled reward points even when there were economic incentives against the collection of points. They attributed this behavior to Medium Maximization: customers often treated collecting reward points as a goal in itself, just like collecting stamps, as opposed to connecting reward points with economic incentives. Correspondingly, they introduced a new model where customers had different “mental accounts” and utility functions for points and cash. Gao et al. [2014] observed via experimentation that customers often collect reward points for exogenous reasons until they accumulate a threshold amount, after which they start investing effort toward the collection process itself. That is, customers build up switching costs (Klemperer [1995]) before fully adopting the reward program, and sometimes this switching cost is created due to reasons exogenous to rational economic incentives. They referred to this behavior as the Tipping Point Effect.

A large body of literature investigates the switching costs customers face within a competitive duopoly framework; see Villas-Boas [2015] for a short survey. Our model is closest in spirit to that of Hartmann and Viard [2008] and Kopalle and Neslin [2001]. Both papers are empirical in nature and model a competitive duopoly where customers maximize their long term discounted utility. Hartmann and Viard [2008] argue that less frequent buyers face higher switching costs as they are more likely to be affected by reward redemption deadlines, whereas frequent buyers redeem rewards easily and do not face substantial switching costs. They do not model how customers build up switching costs, but only argue what happens when customers are close to achieving a reward. Kopalle and Neslin [2001] discuss dynamic competition between two merchants deciding whether to offer a reward program or traditional pricing and model this decision problem as a two stage game: first merchants decide whether to offer a reward program or traditional pricing and then they decide their prices. Using simulations, depending on customer parameters in the model, they characterize the conditions for when it is better to offer a reward program versus traditional pricing. We on the other hand model a multi-period problem where the customer behavior is characterized using a complete dynamic program, and mathematically analyze our model. We make two modeling assumptions: first is an exogenous visit probability bias toward the reward program merchant which can be attributed to excess loyalty, as customers often build up higher brand preference toward the merchant offering a reward program (Fader and Schmittlein [1993], Sharp and Sharp [1997]); and second, a look-ahead factor for customers, which indicates how far into the future customers can perceive the rewards (Liu [2007], Lewis [2004]). Our results on customer choice dynamics intuitively look similar to some of those obtained in the above mentioned body of literature. But more importantly, we model and optimize the revenue objective of the merchant, characterizing an optimal reward program design for maximizing expected revenue.

Our Contributions:


Model Overview

We model a competitive duopoly of two merchants, one of them offering a frequency reward program and the other offering traditional pricing. Both merchants sell an identical good at fixed precommitted prices. The reward program merchant sells the good at a higher price. With each purchase from the reward program merchant, a customer gains some fixed number of points, and on achieving the reward redemption threshold, (s)he immediately gains the reward value as a dollar cashback.

Customers measure their utilities in rational economic terms, i.e., they make their purchase decisions to maximize long term discounted rewards. The discount factor is the time value of money, and we assume it to be constant for all customers. We also assume that every customer makes a purchase every day from either of the two merchants. We relax these two assumptions by introducing a look-ahead factor that controls how far into the future a customer can perceive the rewards. This affects the customer behavior dynamics as follows: if the reward redemption threshold is farther than the customer’s look-ahead parameter, (s)he is unable to perceive the future value of that reward and take it into consideration while maximizing long term utility. This parameter, being customer specific, adds heterogeneity to both the future discounting and purchase frequency: customers having high purchase frequency might be able to perceive rewards with higher redemption thresholds. We only model myopic and strategic customers, i.e., the look-ahead parameter being 0 or a large value, and leave further parametrization for future work. But importantly, the framework we develop could be applied and modified to more complex look-ahead distributions.

In addition, we assume each customer has a visit probability bias with which (s)he purchases the good from the reward program merchant for reasons exogenous to utility maximization. This behavior may be attributed to excess loyalty (Fader and Schmittlein [1993], Sharp and Sharp [1997]), which has been argued to be an important parameter for the success of any reward program, or to price insensitivity of customers: whenever a customer is price insensitive, (s)he strictly prefers to purchase from the reward program merchant, as (s)he gains points redeemable for rewards in the future. There are many possible reasons for customers' price insensitivity: the reward program merchant could be offering some other monopoly products, or the customer might be getting reimbursed for some purchases as part of corporate perks (e.g., corporate travel). In effect, this visit probability bias controls how frequently the customer's points increase even when (s)he does not actively choose to make purchases from the reward program merchant. Both the look-ahead and excess loyalty parameters can be attributed to bounded rationality of customers and have been argued to be important factors in customer choice dynamics, as discussed above in the related work.

Results Overview

We formulate the customer choice dynamics as a dynamic program whose state is the number of points collected from the reward program merchant. When the customer is not making a biased visit to the reward program merchant, (s)he makes a purchase decision by comparing the immediate utility of purchasing the good at a cheaper price with the long-term utility of waiting and receiving the time-discounted reward. The solution to the customer's dynamic program gives conditions for the existence and achievability of a phase transition: a points threshold before which the customer visits the merchant offering rewards only due to the visit probability bias, and after which (s)he adopts the program and always visits the merchant offering rewards until receiving the reward. We show that this phase transition occurs sooner for strategic customers. Increasing the reward value also makes the phase transition occur earlier. However, increasing the points threshold required to redeem the reward, or the price discount offered by the traditional pricing merchant, delays this tipping point. In short, these results verify that our model is consistent with the psychological constructs discussed in the related work section: purchase acceleration closer to reward redemption, and a tipping point before which purchases are due only to the loyalty bias.

After characterizing the customer behavior dynamics in our model, we optimize the long-run revenue that the reward program merchant achieves. We model a specific case of proportional promotion budgeting: the reward offered by the reward program merchant is proportional to the product of the distance to the reward and the discount provided by the traditional pricing merchant, with the proportionality constant being another parameter in the design of the reward program. We show that under proportional promotion budgeting, the optimal distance to reward and the proportionality constant follow an intuitive product relationship that is independent of the customer population parameters, and that these values correspond closely to cashback percentages observed in the real world. In addition, optimizing the revenue objective gives the same optimal distance to reward as minimizing the phase transition point defined above. Moreover, for a specific choice of loyalty bias distribution, we characterize the conditions, in terms of the customer parameters, under which the reward program merchant earns more revenue than the traditional pricing merchant and under which it is better for the reward program merchant to offer a reward than to offer none. We show that for the reward program to be effective under both of these conditions, a minimum fraction of the customer population must be strategic. Moreover, there is a specific range of values of the loyalty bias between 0 and 1, depending on the fraction of strategic customers, for which the reward program is strictly better for the merchant.

4.1 Model

We index the two competing merchants selling identical goods as A and B. Without loss of generality, we assume that A sells the good for a price of 1 dollar while B sells it for $1 - v$ dollars, i.e., B offers a discount of $v$ dollars. A, on the other hand, offers a reward of value $R$ dollars to a customer after (s)he makes $k$ purchases at A. We investigate only the case that we refer to as “proportional promotion budgeting”, wherein the reward $R$ is proportional to the product of the distance to the reward $k$ and the discount $v$ provided by B. That is, $R = \alpha k v$, where $\alpha$ is assumed to be a constant.

4.1.1 Customer Behavior Model

We assume customers purchase the item from either A or B every day, i.e., we ignore heterogeneity in purchase frequency among customers and leave it for future work. We assume customers have a linear, homogeneous utility in price: at price $p$ the utility is $\nu(p) = 1 - p$. Customers therefore get an immediate utility of 0 from A and $v$ from B. All customers have the same time value of money, expressed as a discount factor $\beta$ lying between 0 and 1.

We denote a customer's visit probability bias and look-ahead parameter by $\lambda$ and $t$, respectively. That is, with probability $\lambda$, (s)he purchases from A due to externalities, and (s)he perceives a future reward only if it is within $t$ purchases away. We assume $\lambda$ for a customer is drawn from a uniform distribution on $[0, b]$, where $b$ is between 0 and 1. We focus on a simple threshold distribution for the look-ahead parameter $t$:

$$t = \begin{cases} t_1, & \text{w.p. } p, \\ 0, & \text{w.p. } 1 - p. \end{cases}$$

The above distribution intuitively means that customers are either myopic and focus only on immediate rewards, or far-sighted (we assume $t_1$ is large). We model the customer's decision problem as a dynamic program. We index by $i$ the number of purchases the customer has made from A toward the next reward, for $0 \le i \le k - 1$, and we say a customer is in state $i$ after having made $i$ purchases from A. At state $i$, the customer faces two possibilities:

1. With probability $\lambda$, the customer must visit A, and (s)he moves to state $i + 1$.

2. With probability $1 - \lambda$, the customer may purchase from B for an immediate utility $v$ and remain in state $i$, or purchase from A for no immediate utility but move to state $i + 1$.

Let $V(i)$ denote the long-term expected reward at state $i$. We model the decision problem as the following dynamic program:

$$V(i) = \lambda \beta V(i+1) + (1 - \lambda) \max\{v + \beta V(i),\ \beta V(i+1)\} \quad \text{for } 0 \le i \le k - 1,$$

$$V(k) = R.$$

We show that the decision process exhibits a phase transition: prior to some state, the customer purchases from A only if (s)he must do so exogenously, but after that state, (s)he always decides to purchase from A. This phase transition point is independent of $\lambda$ and, among the variable customer parameters, depends only on $t$. Hence we denote this phase transition point by $i_0(t)$.
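The recursion above can be solved numerically by backward induction from the terminal state. The following is a minimal sketch under stated assumptions, not part of the original analysis: the function name `solve_customer_dp` and the sample parameter values are hypothetical, and `beta` and `lam` stand for the discount factor and the visit probability bias. Because $V(i)$ appears on both sides of the recursion, the sketch evaluates the two candidate values at each state (buy from B when free, or buy from A when free) and keeps the larger one.

```python
def solve_customer_dp(k, R, v, beta, lam):
    """Backward induction for V(i), i = 0..k, with V(k) = R.

    Returns (V, prefers_A), where prefers_A[i] is True when a customer who is
    free to choose at state i buys from A.
    """
    V = [0.0] * (k + 1)
    V[k] = R
    prefers_A = [False] * k
    for i in range(k - 1, -1, -1):
        # Value if the customer buys from A whenever (s)he is free to choose.
        value_A = beta * V[i + 1]
        # Value if the customer buys from B when free: solve
        # V = lam*beta*V(i+1) + (1 - lam)*(v + beta*V) for V.
        value_B = (lam * beta * V[i + 1] + (1 - lam) * v) / (1 - (1 - lam) * beta)
        prefers_A[i] = value_A > value_B
        V[i] = max(value_A, value_B)
    return V, prefers_A


# Hypothetical parameters, chosen only for illustration.
V, prefers_A = solve_customer_dp(k=20, R=1.0, v=0.05, beta=0.9, lam=0.3)
first_A_state = prefers_A.index(True)
print("first state at which the customer chooses A:", first_A_state)
print("V is increasing:", all(V[i] < V[i + 1] for i in range(20)))
```

With these sample values the customer switches to always buying from A at a single state and never switches back, which is the phase transition behavior described in the text.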


4.1.2 Merchant Objective

Given the above model of customer dynamics, we define the revenue objectives of A and B, where A chooses its parameters and B is non-strategic. We define the rate of revenue for a merchant from a customer as the expected time-averaged revenue that the merchant receives over the customer's lifetime. For simplicity, we assume merchants do not discount future revenues. As described above, a customer's dynamics repeat after each reward cycle. Thus, the lifetime dynamics of customer behavior form a regenerative process with independent and identically distributed reward cycle lengths. Let $RoR_A(c)$ and $RoR_B(c)$ denote the expected rates of revenue for A and B, respectively, over a customer $c$'s lifetime. Let $\tau(t, \lambda)$ denote the total number of purchases the customer makes before reaching the phase transition point $i_0(t)$. Then the length of the reward cycle (the total number of purchases the customer makes before receiving the reward) is $\tau(t, \lambda) + k - i_0(t)$, since after the phase transition (s)he makes all purchases from A until (s)he receives the reward. In this cycle, the customer makes $k$ visits to A and $\tau(t, \lambda) - i_0(t)$ visits to B. The revenue that A earns in one such cycle is $k - R$, and the revenue that B earns is $(1 - v)(\tau(t, \lambda) - i_0(t))$. Thus the rates of revenue for A and B from customer $c$ are as follows:

$$RoR_A(c) = E_\tau\left[\frac{k - R}{\tau(t,\lambda) + k - i_0(t)}\right]$$

$$RoR_B(c) = E_\tau\left[\frac{(\tau(t,\lambda) - i_0(t))(1 - v)}{\tau(t,\lambda) + k - i_0(t)}\right]$$

Since the process for a single customer is regenerative, the renewal reward theorem (Cinlar [1969]) lets us take the expectation over the cycle length separately in the numerator and the denominator. Note that $E_\tau[\tau(t,\lambda)] = i_0(t)/\lambda$, because before reaching the phase transition point, each purchase increases the number of purchases from A by 1 with probability $\lambda$ and leaves it unchanged with probability $1 - \lambda$. Taking the expectation over the customer population, the overall rates of revenue for A and B are as follows:

$$RoR_A = E_{\lambda,t}\left[\frac{k - R}{i_0(t)/\lambda + k - i_0(t)}\right] \tag{4.1}$$

$$RoR_B = E_{\lambda,t}\left[\frac{(i_0(t)/\lambda - i_0(t))(1 - v)}{i_0(t)/\lambda + k - i_0(t)}\right] \tag{4.2}$$
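To make the per-customer quantities inside these expectations concrete, the sketch below evaluates them for one customer with a given visit probability bias and phase transition point. It is an illustrative sketch only: the function name `rates_of_revenue` and the sample numbers are assumptions, and `lam` stands for the visit probability bias. It uses the expected number of purchases before the phase transition, `i0 / lam`, exactly as in Eqs. (4.1) and (4.2).

```python
def rates_of_revenue(k, R, v, lam, i0):
    """Per-customer revenue rates for A and B over one expected reward cycle."""
    expected_tau = i0 / lam                  # expected purchases before the phase transition
    cycle_length = expected_tau + k - i0     # expected purchases per reward cycle
    visits_to_B = expected_tau - i0          # expected purchases made at B per cycle
    ror_A = (k - R) / cycle_length
    ror_B = (1 - v) * visits_to_B / cycle_length
    share_A = k / cycle_length               # fraction of purchases made at A
    share_B = visits_to_B / cycle_length     # fraction of purchases made at B
    return ror_A, ror_B, share_A, share_B


# Hypothetical strategic customer (phase transition at i0 = 14) and
# myopic customer (i0 = k), both with lam = 0.5.
strategic = rates_of_revenue(k=20, R=1.0, v=0.05, lam=0.5, i0=14)
myopic = rates_of_revenue(k=20, R=1.0, v=0.05, lam=0.5, i0=20)
print("strategic:", strategic)
print("myopic:   ", myopic)
```

For the myopic customer the rate for A reduces to `lam * (k - R) / k`, since every visit to A is then due to the bias alone.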

4.2 Results

4.2.1 Customer Choice Dynamics

We first show that every customer exhibits the following behavior: until (s)he reaches the phase transition point $i_0(t)$, (s)he purchases from A only due to the exogenous visit probability bias, and after that (s)he always purchases from A until (s)he receives the reward. This behavior is cyclic and repeats after every reward redemption.

Lemma 4. $V(i)$ is an increasing function of $i$ if the following condition holds:

$$R > \frac{(1-\lambda)v}{1-\beta} \tag{4.3}$$

Furthermore, $V(i)$ can be evaluated as:

$$V(i) = \max\left\{\frac{\lambda\beta V(i+1) + (1-\lambda)v}{1 - (1-\lambda)\beta},\ \beta V(i+1)\right\} \tag{4.4}$$

Proof. We first show by induction that $V(i)$ is an increasing function of $i$. For the base case, we show that if the condition above is satisfied, then $V(k-1) < V(k) = R$. Suppose not, so $V(k-1) \ge R$. Then from the dynamic program above, we have:

$$\begin{aligned}
V(k-1) &= \lambda\beta V(k) + (1-\lambda)(v + \beta V(k-1)) \\
&= \frac{\lambda\beta R + (1-\lambda)v}{1-(1-\lambda)\beta} \\
&< \frac{\lambda\beta R + (1-\beta)R}{1-(1-\lambda)\beta} \\
&= \frac{R(1-(1-\lambda)\beta)}{1-(1-\lambda)\beta} = R
\end{aligned}$$

But this is a contradiction, so $V(k-1) < V(k)$. Now assume $V(i+1) < V(i+2)$ for some $i \le k-2$; we will show that this implies $V(i) < V(i+1)$. Suppose not, so $V(i) \ge V(i+1)$. As before, we may upper bound $V(i)$:

$$\begin{aligned}
V(i) &= \lambda\beta V(i+1) + (1-\lambda)(v + \beta V(i)) \\
&\le (1-\lambda)v + \beta V(i) \\
\iff V(i) &\le \frac{(1-\lambda)v}{1-\beta}
\end{aligned}$$

But because $V(i+1) < V(i+2)$, we may lower bound $V(i+1)$:

$$\begin{aligned}
V(i+1) &\ge \lambda\beta V(i+2) + (1-\lambda)(v + \beta V(i+1)) \\
&= (1-\lambda)v + (1-\lambda)\beta V(i+1) + \lambda\beta V(i+2) \\
&> (1-\lambda)v + \beta V(i+1) \\
\iff V(i+1) &> \frac{(1-\lambda)v}{1-\beta}
\end{aligned}$$

Together these give $V(i+1) > V(i)$, which again is a contradiction, so $V(i) < V(i+1)$, and $V(i)$ is an increasing function of $i$. Now we prove the second claim. We have the following:

$$\begin{aligned}
V(i) &= \lambda\beta V(i+1) + (1-\lambda)\max\{v + \beta V(i),\ \beta V(i+1)\} \\
&= \max\{\lambda\beta V(i+1) + (1-\lambda)(v + \beta V(i)),\ \beta V(i+1)\}
\end{aligned}$$

Assuming $V(i)$ is the left term in the above maximum, we may solve the equation for that term:

$$\begin{aligned}
V(i) &= \lambda\beta V(i+1) + (1-\lambda)(v + \beta V(i)) \\
(1-(1-\lambda)\beta)V(i) &= \lambda\beta V(i+1) + (1-\lambda)v \\
V(i) &= \frac{\lambda\beta V(i+1) + (1-\lambda)v}{1-(1-\lambda)\beta}
\end{aligned}$$

And we get our claim.

Now if the expected reward of the customer increases with the number of purchases made from

A, we expect that at some number of purchases it becomes profitable for the customer to choose

to purchase from A as opposed to B. We characterize this phase transition point in the following

theorem.

Theorem 4.2.1. Suppose $V(i)$ is an increasing function of $i$, and consider a customer with look-ahead parameter $t$. A phase transition occurs after (s)he makes $i_0(t)$ visits to firm A, where $i_0(t)$ is given by:

$$i_0(t) = \begin{cases} k - \Delta \equiv i_0, & \text{if } t \ge \Delta, \\ k - t, & \text{otherwise,} \end{cases} \tag{4.5}$$

with

$$\Delta = \left\lfloor \log_\beta\left(\frac{v}{R(1-\beta)}\right) \right\rfloor \tag{4.6}$$


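Proof. First we solve for the condition on $V(i+1)$ under which the customer willingly chooses firm A over B.

$$\begin{aligned}
& \beta V(i+1) > \frac{\lambda\beta V(i+1) + (1-\lambda)v}{1-(1-\lambda)\beta} \\
\iff\ & \beta V(i+1)\left(1 - \frac{\lambda}{1-(1-\lambda)\beta}\right) > \left(\frac{1-\lambda}{1-(1-\lambda)\beta}\right)v \\
\iff\ & \beta V(i+1)\left(\frac{1-(1-\lambda)\beta-\lambda}{1-(1-\lambda)\beta}\right) > \left(\frac{1-\lambda}{1-(1-\lambda)\beta}\right)v \\
\iff\ & \beta V(i+1)\left(\frac{(1-\lambda)(1-\beta)}{1-(1-\lambda)\beta}\right) > \left(\frac{1-\lambda}{1-(1-\lambda)\beta}\right)v \\
\iff\ & \beta V(i+1) > \frac{v}{1-\beta} \\
\iff\ & V(i+1) > \frac{v}{\beta(1-\beta)}
\end{aligned}$$

Let $i_0$ be the minimum state $i$ such that the above holds, so in particular $V(i_0) \le \frac{v}{\beta(1-\beta)}$ but $V(i_0+1) > \frac{v}{\beta(1-\beta)}$. Because $V$ is increasing in $i$, this point is indeed a phase transition: $V(i) > \frac{v}{\beta(1-\beta)}$ for all $i > i_0$, so after this point the customer always chooses firm A. We may compute $V(i_0)$ easily using this fact.

$$V(i_0) = \beta V(i_0+1) = \cdots = \beta^{k-i_0}V(k) = \beta^{k-i_0}R$$

Thus, we have the following:

$$\begin{aligned}
& \beta^{k-i_0} \le \frac{v}{R\beta(1-\beta)} < \beta^{k-(i_0+1)} \\
\iff\ & k - i_0 \ge \log_\beta\left(\frac{v}{R\beta(1-\beta)}\right) > k - (i_0+1) \\
\iff\ & i_0 \le k - \log_\beta\left(\frac{v}{R(1-\beta)}\right) + 1 < i_0 + 1 \\
\iff\ & i_0 = k - \left\lfloor \log_\beta\left(\frac{v}{R(1-\beta)}\right) \right\rfloor \equiv k - \Delta
\end{aligned}$$

If $t \ge \Delta$, the customer perceives the reward prior to this tipping point, so $i_0(t) = i_0 = k - \Delta$. If $t < \Delta$, the customer does not perceive the reward at this point, and as soon as (s)he perceives the reward, (s)he is beyond this point and adopts the reward program, so $i_0(t) = k - t$. After incorporating our specific look-ahead distribution, the above dependence reduces to the following:

$$i_0(t) = \begin{cases} i_0, & \text{w.p. } p, \\ k, & \text{w.p. } 1 - p. \end{cases}$$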
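The closed form in Eqs. (4.5) and (4.6) is straightforward to evaluate. The sketch below is illustrative only: the function names and the sample parameters are assumptions, and `beta` stands for the discount factor. It computes the phase transition point for a strategic and a myopic customer and checks the strategic value against the threshold on the value function derived in the proof.

```python
import math


def delta(R, v, beta):
    """Floor of log base beta of v / (R * (1 - beta)), as in Eq. (4.6)."""
    return math.floor(math.log(v / (R * (1 - beta)), beta))


def phase_transition_point(k, R, v, beta, t):
    """i0(t) from Eq. (4.5): k - Delta if t >= Delta, otherwise k - t."""
    d = delta(R, v, beta)
    return k - d if t >= d else k - t


# Hypothetical parameters, chosen only for illustration.
k, R, v, beta = 20, 1.0, 0.05, 0.9
i0_strategic = phase_transition_point(k, R, v, beta, t=10**6)  # large look-ahead
i0_myopic = phase_transition_point(k, R, v, beta, t=0)         # no look-ahead

# After the transition V(i) = beta**(k - i) * R, so i0 should straddle the threshold.
threshold = v / (beta * (1 - beta))
straddles = beta ** (k - i0_strategic) * R <= threshold < beta ** (k - i0_strategic - 1) * R
print(i0_strategic, i0_myopic, straddles)
```

A myopic customer never adopts the program before the reward itself, which is why the myopic transition point equals `k`.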


Note that the phase transition point is independent of $\lambda$, the customer's visit probability bias toward the merchant. As we would expect from Eq. (4.5), it decreases (weakly) with the look-ahead parameter and increases with the price discount offered by merchant B. Additionally, it decreases with an increase in the reward value ($R$) and with a decrease in the distance to reward ($k$). The variation with the discount factor $\beta$ is interesting: we can show that for any $R/v \ge 1$ there exists a $\beta \in [0, 1]$ that minimizes the phase transition point $i_0$ for strategic customers. We refer to the ratio of the number of visits required for a forward-looking customer to adopt a reward program to the total distance to the reward as the “influence zone”. Intuitively, this is the fraction of visits over which the merchant wants to influence the customer by offering exogenous means of earning additional points, like bonus miles in airlines or accelerated earnings, as discussed in the introduction. Next we find the optimal $k$ for minimizing this influence zone when $\alpha$ is constant.

Remark 1. The influence zone is minimized at $k = \frac{e}{\alpha(1-\beta)}$ under proportional promotion budgeting, as long as $\beta$ is close to 1.

Proof. As defined, the influence zone is $\frac{i_0}{k} = \frac{k-\Delta}{k} = 1 - \frac{\Delta}{k}$. Thus minimizing the influence zone is equivalent to minimizing $\frac{k}{\Delta}$.

$$\frac{k}{\Delta} = \frac{k}{\log_\beta\left(\frac{1}{\alpha k(1-\beta)}\right)} \sim \frac{k(1-\beta)}{\log(\alpha k(1-\beta))}$$

The above approximation relies on $\beta$ being close to 1, so that $-\log\beta \approx 1 - \beta$. This value is minimized at $k = \frac{e}{\alpha(1-\beta)}$. Therefore, for all distributions of excess loyalty, the optimal value for $k$ is $\frac{e}{\alpha(1-\beta)}$, the value at which $\frac{k}{\Delta}$ is minimized and equals $\frac{e}{\alpha}$. At this value the influence zone equals $1 - \frac{\alpha}{e}$.

Note that if $\alpha$ is 1, then the value of $k$ corresponds to a cashback between 2% and 4% as $\beta$ ranges between 0.95 and 0.9. This is in line with what is observed in practice.
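The minimizer in Remark 1 can be checked numerically. The sketch below is a hedged illustration rather than part of the original argument: the function names, the grid step, and the search range are assumptions. It performs a simple grid search on the approximate ratio of the reward distance to $\Delta$ and compares the result with the formula $e / (\alpha(1-\beta))$.

```python
import math


def approx_k_over_delta(k, alpha, beta):
    """Approximation of k / Delta for beta near 1: k(1 - beta) / ln(alpha * k * (1 - beta))."""
    return k * (1 - beta) / math.log(alpha * k * (1 - beta))


def grid_minimizer(alpha, beta, step=0.01):
    """Grid search over k with alpha * k * (1 - beta) in (1, 10], where the log is positive."""
    k_lo = 1.0 / (alpha * (1 - beta))
    n_steps = int(9.0 / (alpha * (1 - beta)) / step)
    candidates = (k_lo + j * step for j in range(1, n_steps + 1))
    best_k = min(candidates, key=lambda k: approx_k_over_delta(k, alpha, beta))
    return best_k, approx_k_over_delta(best_k, alpha, beta)


for beta in (0.90, 0.95):  # sample discount factors, with alpha fixed to 1
    k_grid, value = grid_minimizer(alpha=1.0, beta=beta)
    k_formula = math.e / (1.0 * (1 - beta))
    print(f"beta={beta}: grid k={k_grid:.2f}, formula k={k_formula:.2f}, ratio={value:.4f}")
```

The grid minimizer should agree with the formula up to the grid resolution, and the minimized ratio should be close to `e / alpha`.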

4.2.2 Merchant Objective Dynamics

Optimizing Reward Parameters

So far we have characterized the customer behavior within the duopoly without concern about the

particular reward design parameters. In this section, we derive optimal parameters for the reward

program design with the objective of maximizing the revenue of the reward program merchant.

Interestingly, we see that maximizing revenue corresponds to minimizing the influence zone, as

illustrated above.

Theorem 4.2.2. Under proportional promotion budgeting, the optimal reward distance ($k$) and the optimal budget proportion ($\alpha$) for merchant A follow the relation $\alpha k = \frac{e}{1-\beta}$ for all distributions of $\lambda$, as long as $\beta$ is close to 1.

Proof. Let $\theta = \frac{\Delta}{k}$. First, we evaluate $RoR_A$ by substituting the phase transition point obtained above into the rate of revenue equation for A. Since we assume that $\lambda$ and $t$ are drawn independently of each other, we can separate the expectations and evaluate them sequentially, first over $t$, then over $\lambda$.

$$\begin{aligned}
RoR_A &= E_{\lambda,t}\left[\frac{k-R}{i_0(t)/\lambda + k - i_0(t)}\right] \\
&= E_\lambda\left[p\cdot\frac{k-R}{i_0/\lambda + k - i_0} + (1-p)\frac{\lambda(k-R)}{k}\right] \\
&= E_\lambda\left[p\cdot\frac{\lambda(k-R)}{k\lambda + i_0(1-\lambda)} + (1-p)\frac{\lambda(k-R)}{k}\right] \\
&= E_\lambda\left[p\cdot\frac{\lambda(1-\alpha v)}{1-\theta(1-\lambda)} + (1-p)\lambda(1-\alpha v)\right]
\end{aligned}$$

Observe that the term inside the expectation is maximized when $\theta$ is maximized, for all values of $\lambda \in (0, 1)$. Using Leibniz's rule, we can conclude that the integral itself is maximized when $\theta$ is maximized, which, as shown above, is equivalent to minimizing the influence zone. As shown in Remark 1, this happens at $\alpha k = \frac{e}{1-\beta}$. At this point, $\theta = \frac{\Delta}{k} = \frac{\alpha}{e}$.

An interesting point to observe above is that if $\alpha$ is constant, then maximizing the revenue objective is equivalent to minimizing the influence zone. This result matches the intuition that the faster the merchant can get customers to adopt the reward program, the more purchases they will make with the merchant in the long run, but it is stronger, as it shows that doing so actually maximizes the revenue objective as well. Although reward point accelerations are common and effective mechanisms for getting customers to adopt reward programs, we have shown that designing the reward program so that a minimum number of such accelerations is required maximizes the merchant's revenue. The condition that $\beta$ be close to 1 is not very restrictive, as the discount factor is expected to be high in most cases. Note that because $k \ge \Delta$, the above also shows $\alpha \le e$. Finally, observe that we need $R > \frac{(1-\lambda)v}{1-\beta}$ for $V$ to be increasing. We meet this condition with proportional budgeting when $k = \frac{e}{\alpha(1-\beta)}$, as $R = \alpha k v = \frac{ev}{1-\beta} > \frac{v}{1-\beta} \ge \frac{(1-\lambda)v}{1-\beta}$.

The above framework can be used for optimizing for the reward parameters to maximize A’s

rate of revenue, for varying distributions of the customer population. That is, if a merchant has a

good estimate of its customer population’s distribution, it can easily utilize the above theorem to

optimize its reward scheme. We leave the competitive study where merchant B could strategize on

its discount value v for future work.

Figure 4.1 shows how the rate of revenue of the reward program merchant varies with $\alpha$, after fixing $\alpha k$ as in the previous theorem, for various distributions of the loyalty bias parameter. We observe three general patterns for $RoR_A$: for large values of $v$, $RoR_A$ decreases over all feasible values of $\alpha$; for small values of $v$, it increases for all values of $\alpha$; and for some intermediate values of $v$, it is convex with a minimizing $\alpha$ in $(0, e)$. That is, the rate of revenue for A is maximized at $\alpha$ equal to 0 or $e$, and no maximizer exists strictly between 0 and $e$ for any of these distributions. We believe this to hold for all distributions of practical interest, as our simulations suggest. Thus, in our model the reward program merchant only needs to decide between not offering the reward (setting $\alpha$ to 0) and offering the highest feasible reward (setting $\alpha$ to $e$). Note that the exact values of $v$ for which these patterns occur also depend on $p$ and the parameters of the specific distribution of loyalty bias. In the following subsection, we explore these conditions in detail for the uniform distribution of loyalty bias and fixed $\alpha$.

Figure 4.1: Rate of revenue for the reward merchant as a function of $\alpha$ (with $k = \frac{e}{\alpha(1-\beta)}$) for different distributions: (a) uniform distribution, (b) normal distribution, (c) logit-normal distribution. For all distributions, $\beta = 0.9$, $p = 0.9$, and $v$ varies as labeled. The uniform distribution is on $(0, b]$ with $b = 0.9$; the normal distribution has $\mu = 0.5$ and $\sigma = 0.1$; and the logit-normal distribution is the standard one on $[0, 1]$.


Revenue Comparisons

We characterize the conditions under which it is strictly better for A to offer a reward program, for a specific distribution of the loyalty bias parameter: $\lambda$ for every customer is drawn uniformly at random from $(0, b]$, where $b$ is less than 1. We assume this distribution for the remainder of the section. The requirement amounts to two conditions: first, the rate of revenue for A should be higher than that of B, and second, the rate of revenue for A should be higher than what it could have achieved by not offering the reward program at the same fixed price. We begin by evaluating the expected rates of revenue for both A and B under the optimality relation between $k$ and $\alpha$ mentioned above, with $\lambda$ drawn from a uniform distribution.

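$$\begin{aligned}
RoR_A &= E_\lambda\left[p\cdot\frac{\lambda(1-\alpha v)}{1-\theta(1-\lambda)} + (1-p)\lambda(1-\alpha v)\right] \\
&= pk\cdot\frac{1-\alpha v}{\Delta}\cdot\left(1 - \frac{k-\Delta}{b\Delta}\log\left(1 + \frac{b\Delta}{k-\Delta}\right)\right) + (1-p)\frac{bk(1-\alpha v)}{2k} \\
&= (1-\alpha v)\left(\frac{pe}{\alpha}\left(1 - \frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right)\right) + (1-p)\frac{b}{2}\right)
\end{aligned}$$

$$\begin{aligned}
RoR_B &= E_{\lambda,t}\left[\frac{(i_0(t)/\lambda - i_0(t))(1-v)}{i_0(t)/\lambda + k - i_0(t)}\right] \\
&= E_\lambda\left[p\cdot\frac{(i_0/\lambda - i_0)(1-v)}{i_0/\lambda + k - i_0} + (1-p)\frac{(k/\lambda - k)(1-v)}{k/\lambda}\right] \\
&= E_\lambda\left[p\cdot\frac{i_0(1-\lambda)(1-v)}{k\lambda + i_0(1-\lambda)} + (1-p)(1-\lambda)(1-v)\right] \\
&= p\cdot\frac{i_0(1-v)}{b(k-i_0)^2}\left(k\log\left(1 + \frac{b(k-i_0)}{i_0}\right) - b(k-i_0)\right) + (1-p)\left(1-\frac{b}{2}\right)(1-v) \\
&= p\cdot\frac{i_0(1-v)}{k-i_0}\left(\frac{k}{b(k-i_0)}\log\left(1 + \frac{b(k-i_0)}{i_0}\right) - 1\right) + (1-p)\left(1-\frac{b}{2}\right)(1-v) \\
&= (1-v)\left(p\cdot\frac{e-\alpha}{\alpha}\left(\frac{e}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right) - 1\right) + (1-p)\left(1-\frac{b}{2}\right)\right) \\
&= (1-v)\left(\frac{pe}{\alpha}\left(\frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right) - \frac{e-\alpha}{e}\right) + (1-p)\left(1-\frac{b}{2}\right)\right)
\end{aligned}$$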

Observe that both of the above expressions have a left term and a right term. The left term is the rate of revenue obtained from strategic customers, whereas the right term is that obtained from myopic customers. As $\alpha$ ranges between 0 and $e$, the market share factor in the left term increases for $RoR_A$ and decreases to 0 for $RoR_B$. That is, by controlling the reward budget ratio, A is able to gain the entire strategic customer base. But observe how $RoR_A$ varies with $\alpha$: the marginal revenue term $(1 - \alpha v)$ decreases with $\alpha$ as the merchant gives higher rewards to customers, while the market share term increases as A gains more of the strategic customer base. As $\alpha \to 0$, $RoR_A \to b/2$, i.e., the revenue earned is due only to the loyalty bias and equals the revenue earned by A when not running any reward program.
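The closed-form expressions for the uniform loyalty bias can be sanity-checked against direct numerical integration. The sketch below is an illustration under stated assumptions: the function names and the sample values of the budget ratio, the discount, the strategic fraction `p`, and the upper limit `b` are hypothetical, and the numerical average uses a plain midpoint rule over the bias, with the ratio of $\Delta$ to the reward distance fixed at the budget ratio divided by $e$.

```python
import math


def ror_A_closed_form(alpha, v, p, b):
    """Final closed form of RoR_A for a bias drawn uniformly on (0, b]."""
    x = b * alpha / (math.e - alpha)
    strategic = (math.e / alpha) * (1 - math.log1p(x) / x)
    return (1 - alpha * v) * (p * strategic + (1 - p) * b / 2)


def ror_B_closed_form(alpha, v, p, b):
    """Final closed form of RoR_B for a bias drawn uniformly on (0, b]."""
    x = b * alpha / (math.e - alpha)
    strategic = (math.e / alpha) * (math.log1p(x) / x - (math.e - alpha) / math.e)
    return (1 - v) * (p * strategic + (1 - p) * (1 - b / 2))


def ror_numeric(alpha, v, p, b, n=200_000):
    """Midpoint-rule average of the per-customer rates over the bias."""
    theta = alpha / math.e
    total_A = total_B = 0.0
    for j in range(n):
        lam = (j + 0.5) * b / n
        share_A = lam / (1 - theta * (1 - lam))  # strategic customer's share of visits to A
        total_A += (1 - alpha * v) * (p * share_A + (1 - p) * lam)
        total_B += (1 - v) * (p * (1 - share_A) + (1 - p) * (1 - lam))
    return total_A / n, total_B / n


# Hypothetical parameter values, chosen only for illustration.
alpha, v, p, b = 1.0, 0.05, 0.9, 0.9
numeric_A, numeric_B = ror_numeric(alpha, v, p, b)
print(ror_A_closed_form(alpha, v, p, b), numeric_A)
print(ror_B_closed_form(alpha, v, p, b), numeric_B)
```

Evaluating `ror_A_closed_form` at a very small budget ratio should return a value close to `b / 2`, in line with the limit stated in the text.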

Figure 4.2 illustrates the region in terms of the customer parameters $(b, p)$ where it is better for A to offer a reward program, i.e., $RoR_A > RoR_B$ (indicated in blue) and $RoR_A > b/2$ (indicated in yellow), for different values of $\alpha$, keeping $v = 0.05$ and $\beta = 0.95$ fixed. The blue region shows that there is a clear threshold of $b$ and $p$ values beyond which $RoR_A > RoR_B$. More interestingly, the threshold values of $b$ and $p$ decrease as $\alpha$ is increased toward $e$. The yellow region, in turn, shows that if the fraction of strategic customers is not too small, the firm should choose to run a reward program most of the time, except when $b$ is large: larger $b$ values mean that customers make more exogenous visits, so a reward program is no longer needed to entice visits and only decreases the profits of the reward program merchant. The intersection of the two regions, i.e., the region in green, indicates that the range of values of $b$ for which the reward program is strictly profitable increases as $p$ increases. We formally show this result next.

For any fixed $\alpha$, the exact conditions on $p$, $b$ and $v$ for $RoR_A > RoR_B$ and $RoR_A > b/2$ are rather complex. We first focus on one particularly simple case: $\alpha \to e$.

Lemma 5. As $\alpha \to e$, $RoR_A > RoR_B$ if and only if the following condition on $b$ holds:

$$b > 2\cdot\frac{(1-v) - \frac{p}{1-p}\cdot(1-ev)}{(1-v) + (1-ev)} \tag{4.7}$$

Proof. First we compute the following quantity.

$$\lim_{\alpha \to e}\ \frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right)$$

Let $x = \frac{b\alpha}{e-\alpha}$; then it is easy to see that the above limit is equivalent to $\lim_{x\to\infty}\frac{\log(1+x)}{x} = 0$. Then as $\alpha \to e$, we have the following expressions for $RoR_A$ and $RoR_B$.

$$RoR_A = (1-ev)\left(p + (1-p)\frac{b}{2}\right)$$

$$RoR_B = (1-v)(1-p)\left(1 - \frac{b}{2}\right)$$

And our condition $RoR_A > RoR_B$ simplifies.

$$\begin{aligned}
(1-ev)\left(p + (1-p)\frac{b}{2}\right) &> (1-v)(1-p)\left(1 - \frac{b}{2}\right) \\
\frac{b}{2}(1-p)(1-ev+1-v) &> (1-v)(1-p) - (1-ev)p \\
b &> 2\cdot\frac{(1-v) - \frac{p}{1-p}\cdot(1-ev)}{(1-v) + (1-ev)}
\end{aligned}$$

Figure 4.2: Regions where $RoR_A > RoR_B$ (blue), where $RoR_A > b/2$ (yellow), and where both are true (green) for different values of $\alpha$: (a) $\alpha = 0.5$, (b) $\alpha = 1$, (c) $\alpha = 2$, (d) $\alpha = 2.5$. In all cases, $\beta = 0.95$, $v = 0.05$, and $\lambda$ is drawn uniformly on $(0, b]$.

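The above lemma gives a lower bound on $b$ for $RoR_A > RoR_B$ in terms of $p$ and $v$. In order for the reward program to be strictly better than the traditional pricing model, we also need $RoR_A > b/2$. The following lemma shows that this condition gives a corresponding upper bound on $b$.

Lemma 6. As $\alpha \to e$, $RoR_A > b/2$ if and only if the following condition on $b$ holds:

$$b < \frac{2p}{p + \frac{ev}{1-ev}} \tag{4.8}$$

Proof. The condition $RoR_A > b/2$ is equivalent to:

$$\begin{aligned}
(1-\alpha v)\left(\frac{pe}{\alpha}\left(1 - \frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right)\right) + (1-p)\frac{b}{2}\right) &> \frac{b}{2} \\
\frac{e}{\alpha}\left(1 - \frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right)\right) &> \frac{b}{2p}\left(\frac{1}{1-\alpha v} - (1-p)\right)
\end{aligned}$$

As $\alpha \to e$, the left term above approaches 1 and we are left with:

$$\begin{aligned}
b &< \frac{2p(1-ev)}{1 - (1-p)(1-ev)} \\
&= \frac{2p(1-ev)}{p - pev + ev} \\
&= \frac{2p}{p + \frac{ev}{1-ev}}
\end{aligned}$$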

The previous two lemmas provide lower and upper bounds on $b$ for $RoR_A > RoR_B$ and $RoR_A > b/2$, respectively. For the reward program to be strictly better than all alternatives, both of these conditions must be met. We combine them to get an intuitive necessary and sufficient condition on $p$ for the reward program to be “strictly better”.

Lemma 7. As $\alpha \to e$, for the reward program to be strictly better for some values of $b$, a necessary and sufficient condition on $p$ is:

$$p > 1 - \frac{1-ev}{1-ev^2} \tag{4.9}$$

Proof. The values of $b$ for which both previous lemmas are met are given by:

$$2\cdot\frac{(1-v) - \frac{p}{1-p}\cdot(1-ev)}{(1-v) + (1-ev)} < b < \frac{2p}{p + \frac{ev}{1-ev}}$$

The above inequality is only valid when the lower bound is less than the upper bound. We may manipulate this inequality to get the simple condition on $p$ in our claim.

$$\begin{aligned}
2\cdot\frac{(1-v) - \frac{p}{1-p}\cdot(1-ev)}{(1-v) + (1-ev)} &< \frac{2p}{p + \frac{ev}{1-ev}} \\
\left(p + \frac{ev}{1-ev}\right)\left((1-v) - \frac{p}{1-p}(1-ev)\right) &< p(1-v+1-ev) \\
\frac{(1-v)ev}{1-ev} &< p(1-ev) + \frac{p^2}{1-p}(1-ev) + \frac{p}{1-p}ev \\
\frac{(1-p)(1-v)ev}{1-ev} &< (1-p)p(1-ev) + p^2(1-ev) + pev \\
\frac{(1-v)ev}{1-ev} &< p\left(1 + (1-v)\frac{ev}{1-ev}\right) \\
\frac{(1-v)ev}{(1-ev) + (1-v)ev} &< p \\
\frac{ev - ev^2}{1-ev^2} &< p \\
\frac{(1-ev^2) - (1-ev)}{1-ev^2} &< p \\
1 - \frac{1-ev}{1-ev^2} &< p
\end{aligned}$$

Figure 4.3: The upper and lower bounds on $b$ as a function of $p$. Here $v = 0.05$ and $\alpha \to e$.

Thus, for any choice of $v$, and $p$ obeying the above condition, the combination of the above lemmas gives an interval of $b$ values for which the reward program is the most profitable choice for the merchant. Figure 4.3 shows the bounds on $b$ for varying values of $p$, keeping $v = 0.05$ fixed and restricting the range of $b$ values to $(0, 1)$. Notice that the upper bound on $b$ increases as a function of $p$ while the lower bound decreases with $p$, so the interval of $b$ values where the reward program is strictly better widens with $p$. We formalize this observation in the next lemma.

Lemma 8. As $\alpha \to e$ and with $p$ obeying Eq. 4.9, the range of values of $b$ for which the reward program is strictly better increases as $p$ increases.

Proof. We know that the range of $b$ values in which we are interested is given by the interval

$$2\cdot\frac{(1-v) - \frac{p}{1-p}\cdot(1-ev)}{(1-v) + (1-ev)} < b < \frac{2p}{p + \frac{ev}{1-ev}}$$

Because $p$ obeys Eq. 4.9, the above inequality is valid. We will show that the upper bound increases with $p$ and the lower bound decreases with increasing $p$. Therefore, as $p$ increases, the interval of $b$ values for which the reward program is strictly better grows. First consider the upper bound, $UB(p) = \frac{2p}{p + \frac{ev}{1-ev}}$.

$$UB'(p) = \frac{2ev}{(1-ev)\left(p + \frac{ev}{1-ev}\right)^2} \ge 0, \quad \forall p$$

Now we consider the lower bound, $LB(p) = 2\cdot\frac{(1-v) - \frac{p}{1-p}\cdot(1-ev)}{(1-v) + (1-ev)}$.

$$LB'(p) = -\frac{2(1-ev)}{(1-p)^2((1-v) + (1-ev))} \le 0, \quad \forall p$$
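The bounds from Lemmas 5 to 7 are simple to evaluate for a given discount and strategic fraction. The sketch below is illustrative: the function names and the sample values of `p` are assumptions, and only the discount `v = 0.05` is taken from the figures in the text. It computes both bounds on `b`, the threshold on `p` from Eq. (4.9), and checks that the interval between the bounds opens exactly at that threshold and widens as `p` grows.

```python
import math


def lower_bound_b(p, v):
    """Lower bound on b from Lemma 5 (RoR_A > RoR_B in the limit)."""
    ev = math.e * v
    return 2 * ((1 - v) - p / (1 - p) * (1 - ev)) / ((1 - v) + (1 - ev))


def upper_bound_b(p, v):
    """Upper bound on b from Lemma 6 (RoR_A > b/2 in the limit)."""
    ev = math.e * v
    return 2 * p / (p + ev / (1 - ev))


def min_strategic_fraction(v):
    """Threshold on p from Lemma 7, Eq. (4.9)."""
    ev = math.e * v
    return 1 - (1 - ev) / (1 - ev * v)


v = 0.05
p0 = min_strategic_fraction(v)
print(f"threshold on p: {p0:.4f}")
for p in (0.2, 0.4, 0.6, 0.8):  # sample strategic fractions above the threshold
    print(f"p={p}: {lower_bound_b(p, v):.4f} < b < {upper_bound_b(p, v):.4f}")
```

Since `b` is itself restricted to lie between 0 and 1, a lower bound below 0 or an upper bound above 1 simply means that the corresponding condition does not restrict `b`.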

Figure 4.4 shows the upper and lower bounds on $b$ for all valid pairs of $p$ and $v$ with $\alpha \to e$. The top plot shows the lower bound on $b$ and the bottom plot depicts the upper bound. For a particular $(p, v)$ pair, if the color on the top plot is darker than the corresponding color on the bottom plot, then this pair has a valid $b$ interval in which the reward program is strictly better. This figure also exhibits the increasing range of $b$ values with increasing $p$; for large values of $p$ and moderate values of $v$, we observe no restrictions on $b$ for the reward program to be strictly better. We combine all the above observations into the following theorem.

Theorem 4.2.3. Under proportional budgeting, as $\alpha \to e$, a necessary and sufficient condition for the reward program to be strictly better is a lower bound on $p$ which increases with $v$. Moreover, as $p$ increases beyond this lower bound, the region of allowable $b$ for which the reward program is strictly better becomes larger.

Now we generalize the above result to all values of $\alpha$. The conditions are more complex, but the results and intuitions are similar. The proofs are technical, and we leave them to the appendix.

Lemma 9. Fix $\alpha \in (0, e)$. For any $(p, v)$ pair, there exists some upper bound $b_1 \in [0, 1]$ such that for all $b \le b_1$, $RoR_A \ge b/2$.

Figure 4.4: Bounds on $b$ for various values of $p$ and $v$ at $\alpha \to e$. Top shows lower bounds on $b$ for $RoR_A \ge RoR_B$ and bottom shows upper bounds on $b$ for $RoR_A \ge b/2$.

Lemma 10. Fix $\alpha \in (0, e)$. For any $(p, v)$ pair, there exists some lower bound $b_0 \in [0, 1]$ such that for all $b \ge b_0$, $RoR_A > RoR_B$.

We combine the above two lemmas as before to get the following theorem.

Theorem 4.2.4. Fix $\alpha \in (0, e)$. For any value of $v$, there exists a lower bound $p_0$ such that for any $p$ greater than $p_0$, there exists a range $(b_0, b_1)$ between 0 and 1 such that for all $b$ lying between $b_0$ and $b_1$, offering the reward program is strictly better for A.

The above results can be applied as follows: if a merchant estimates that the loyalty bias parameter is drawn from a uniform distribution and has good estimates of its target customer population, i.e., the $b$ and $p$ values, it can find the reward budget ratios $\alpha$ that would make running a reward program strictly better against a traditional pricing competitor. More importantly, these results show that under mild assumptions on the customer population parameters, reward programs can be beneficial in the competitive duopoly model.

4.3 Conclusions

We investigated the optimal design of a frequency reward program against traditional pricing in a competitive duopoly. We modeled the behavior of customers valuing their utilities in rational economic terms, and our theoretical results agree with past empirical studies. Assuming general distributions of the customer population, we characterized optimal parameters for the design of the reward program, and under more specific assumptions on the parameter distributions, we derived the conditions on customer population parameters that make the reward program strictly better. In short, if a merchant can make good estimates of the customer population parameters, our model and results can help it understand the pros and cons of running a frequency reward program against traditional pricing.

Though our research offers some interesting managerial insights, our study has some limitations. Our results on revenue comparisons assumed specific distributions for the customer population, though our framework can be extended to other distributions as well. Moreover, estimating the customer population distribution and parameters from real transactional data is an interesting question in itself. That is, backing this research with empirical and experimental study could quantify the intuitions we discuss. We modeled customer behavior in rational economic terms, mainly to understand the rational components that affect the decision-making process. Tying the effects studied in our research to past models of customers' psychological behavior patterns toward reward programs would be another practically relevant problem to address. Finally, we modeled a competitive duopoly but left the traditional pricing merchant non-strategic. Understanding how competition affects the equilibrium prices and reward program parameters could give intuitions about a more practical scenario.


4.4 Appendix

4.4.1 Proof of Lemma 9

Proof. We delay the proof of this lemma to first prove a helpful proposition. It is a straightforward computation to see that the condition $\mathrm{RoR}_A \ge b/2$ is equivalent to:

\[
\frac{1}{b}\left(1 - \frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right)\right) \ge \frac{\alpha\left(1 - (1-p)(1-\alpha v)\right)}{2pe(1-\alpha v)}
\iff g(b;\alpha) \ge h(p, v;\alpha)
\]

where we have defined functions $g(b)$ and $h(p, v)$ for fixed $\alpha$ for the above inequalities.

Proposition 4. For a fixed $\alpha$, $g(b)$ is non-increasing for all $b \in (0, 1)$.

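Proof. We take the derivative of $g$:

\[
\begin{aligned}
& g'(b) = \frac{2(e-\alpha)}{b^3\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right) - \frac{1}{b^2} - \frac{1}{b^2\left(1 + \frac{b\alpha}{e-\alpha}\right)} \le 0 \\
\iff\ & \frac{2(e-\alpha)}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right) \le 1 + \frac{1}{1 + \frac{b\alpha}{e-\alpha}} \\
\iff\ & \frac{2\log(1+x)}{x} \le 1 + \frac{1}{1+x}
\end{aligned}
\]

where $x = \frac{b\alpha}{e-\alpha}$, and as $b \in (0, 1)$, $x \in \left(0, \frac{\alpha}{e-\alpha}\right)$. We can see that as $x \to 0$, the above inequality is an equality. We represent the LHS of the above equation as $L(x)$ and RHS as $R(x)$. Next we show that $L(x)$ decreases more quickly than $R(x)$ does for positive $x$, thereby proving the proposition.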

First we show that in the range of $x$ the following holds true:

\[
\left(2 - \frac{1}{1+x}\right)^2 \le 2\log(1+x) + 1 \tag{4.10}
\]

To show the above, observe that at $x \to 0$ both the LHS and RHS are equal. And it is easy to show that the derivative of the LHS is lower than the derivative of the RHS for all $x \ge 0$, as shown:

\[
\begin{aligned}
& (1+x) + \frac{1}{1+x} \ge 2 \\
\implies\ & 2 - \frac{1}{1+x} \le 1 + x \\
\implies\ & \left(2 - \frac{1}{1+x}\right)\cdot\frac{1}{1+x} \le 1 \\
\implies\ & 2\cdot\left(2 - \frac{1}{1+x}\right)\cdot\left(\frac{1}{1+x}\right)^2 \le \frac{2}{1+x}
\end{aligned}
\]

The left hand side is the derivative of the above LHS and the right hand side is the derivative of the above RHS.

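Now we can rearrange Eq. 4.10 as follows:

\[
\begin{aligned}
& \left(2 - \frac{1}{1+x}\right)^2 \le 2\log(1+x) + 1 \\
\implies\ & \left(1 + \frac{x}{1+x}\right)^2 \le 2\log(1+x) + 1 \\
\implies\ & \left(\frac{x}{1+x}\right)^2 + \frac{2x}{1+x} \le 2\log(1+x) \\
\implies\ & 2\left(\frac{x}{1+x} - \log(1+x)\right) \le -\left(\frac{x}{1+x}\right)^2 \\
\implies\ & \frac{2\left(\frac{x}{1+x} - \log(1+x)\right)}{x^2} \le -\left(\frac{1}{1+x}\right)^2
\end{aligned}
\]

The left hand side of the above is $L'(x)$ and the right hand side is $R'(x)$.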
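The chain of inequalities in this proof can be spot-checked numerically. The short Python sketch below evaluates $L(x)$, $R(x)$, their derivatives, and both sides of Eq. 4.10 on a grid of positive $x$. The grid and the helper names are choices made for this illustration, and a check of this kind is only a sanity test, not a substitute for the argument above.

```python
import math


def L(x):
    """L(x) = 2 * log(1 + x) / x."""
    return 2.0 * math.log1p(x) / x


def R(x):
    """R(x) = 1 + 1 / (1 + x)."""
    return 1.0 + 1.0 / (1.0 + x)


def L_prime(x):
    """Derivative of L: 2 * (x / (1 + x) - log(1 + x)) / x^2."""
    return 2.0 * (x / (1.0 + x) - math.log1p(x)) / x ** 2


def R_prime(x):
    """Derivative of R: -1 / (1 + x)^2."""
    return -1.0 / (1.0 + x) ** 2


def gap_4_10(x):
    """Right side minus left side of Eq. 4.10; expected to be non-negative."""
    return 2.0 * math.log1p(x) + 1.0 - (2.0 - 1.0 / (1.0 + x)) ** 2


def holds_on(xs):
    """True if L <= R, L' <= R' and Eq. 4.10 all hold at every grid point."""
    return all(
        L(x) <= R(x) and L_prime(x) <= R_prime(x) and gap_4_10(x) >= 0.0
        for x in xs
    )


if __name__ == "__main__":
    grid = [0.01 * k for k in range(1, 2001)]  # x from 0.01 to 20.0
    print("all inequalities hold on the grid:", holds_on(grid))
```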

Thus, $g(b)$ is decreasing in $b$, so for any $(p, v)$ pair, we may compute $h(p, v;\alpha)$, which will then fall into one of the following three cases.

• $h(p, v;\alpha) \ge g(0)$. So no value of $b$ makes the reward program profitable.

• $h(p, v;\alpha) \le g(1)$. So any value of $b$ makes the reward program profitable.

• $h(p, v;\alpha) = g(b_0)$ for some $b_0 \in (0, 1)$. So the reward program is profitable for all $b \le b_0$ and not otherwise.

The above proposition and discussion proves our lemma: for fixed $\alpha$ and any $(p, v)$ pair, there is some upper bound on $b$ such that $\mathrm{RoR}_A > b/2$.
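Since $g$ is monotone in $b$, the threshold in the third case can be found by bisection. The Python sketch below is one way to do this under stated assumptions: $g(0)$ is read as the limit of $g(b)$ as $b \to 0$, which works out to $\frac{\alpha}{2(e-\alpha)}$ from the expansion $\log(1+x) = x - x^2/2 + \dots$; the function name `upper_bound_on_b`, the tolerance, and the convention of returning 0.0 or 1.0 in the first two cases are choices made for this illustration.

```python
import math


def g(b, alpha, e):
    """g(b; alpha) = (1/b) * (1 - log(1 + x) / x), with x = b*alpha/(e - alpha)."""
    x = b * alpha / (e - alpha)
    return (1.0 - math.log1p(x) / x) / b


def h(p, v, alpha, e):
    """h(p, v; alpha) = alpha*(1 - (1-p)(1 - alpha*v)) / (2*p*e*(1 - alpha*v))."""
    return alpha * (1.0 - (1.0 - p) * (1.0 - alpha * v)) / (
        2.0 * p * e * (1.0 - alpha * v)
    )


def upper_bound_on_b(p, v, alpha, e, tol=1e-10):
    """Largest b in [0, 1] with g(b) >= h(p, v), using that g is non-increasing."""
    target = h(p, v, alpha, e)
    g_at_zero = alpha / (2.0 * (e - alpha))  # limit of g(b) as b -> 0
    if target >= g_at_zero:
        return 0.0  # first case: no b in (0, 1) satisfies the condition
    if target <= g(1.0, alpha, e):
        return 1.0  # second case: every b in (0, 1) satisfies it
    lo, hi = 0.0, 1.0  # invariant: condition holds near lo and fails at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, alpha, e) >= target:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(upper_bound_on_b(p=0.2, v=0.1, alpha=0.5, e=1.0))
```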

4.4.2 Proof of Lemma 10

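Proof. Let $x = \frac{b\alpha}{e-\alpha}$. Then $\mathrm{RoR}_A > \mathrm{RoR}_B$ can be evaluated as follows:

\[
\begin{aligned}
& p\,\frac{e}{\alpha}\left(1 - \frac{\log(1+x)}{x}\right)(1 - \alpha v + 1 - v) - p(1-v) + (1-p)\frac{b}{2}(1 - \alpha v + 1 - v) + p(1-v) > 1 - v \\
\implies\ & p\left(1 - \frac{\log(1+x)}{x}\right) + (1-p)\frac{b\alpha}{2e} > \frac{\alpha}{e}\cdot\frac{1-v}{1 - \alpha v + 1 - v}
\end{aligned}
\tag{4.11}
\]

Since $\alpha$ is a constant, the LHS above is a function of $b$ and $p$. Let the LHS above be $L(b, p)$. We first show that in the range $b \in [0, 1]$, $1 - \frac{\log(1+x)}{x} > \frac{b\alpha}{2e}$, which shows that $L(b, p)$ is increasing in $p$.

\[
1 - \frac{\log(1+x)}{x} > \frac{b\alpha}{2e}
\iff x - \log(1+x) > \frac{b^2\alpha^2}{2e(e-\alpha)}
\]

Observe that the LHS is equal to the RHS when $b \to 0$. All we show is that the LHS increases faster than the RHS in the range $b \in [0, 1]$, i.e., we compare the derivatives with respect to $b$:

\[
\begin{aligned}
& \left(1 - \frac{1}{1+x}\right)\frac{\alpha}{e-\alpha} > \frac{b\alpha^2}{e(e-\alpha)} \\
\iff\ & \frac{1}{1+x} > \frac{e-\alpha}{e} \\
\iff\ & \frac{e}{e-\alpha} > 1 + \frac{b\alpha}{e-\alpha}
\end{aligned}
\]

And the last inequality is true for $b \in [0, 1)$, with equality only at $b = 1$, which suffices. Hence $L(b, p)$ increases with $p$.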

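Now we show that $L(b, p)$ increases with $b$ as well. First observe:

\[
\frac{\partial L(b, p)}{\partial b} = p\left(\frac{\log(1+x) - \frac{x}{1+x}}{x^2}\right)\frac{\alpha}{e-\alpha} + (1-p)\frac{\alpha}{2e}
\]

Thus $\frac{\partial L(b, p)}{\partial b} > 0$ is equivalent to:

\[
\begin{aligned}
& (1-p)\frac{\alpha}{2e} > p\left(\frac{\frac{x}{1+x} - \log(1+x)}{x^2}\right)\frac{\alpha}{e-\alpha} \\
\iff\ & \frac{(1-p)b^2\alpha^2}{2e(e-\alpha)} > p\left(1 - \frac{1}{1+x} - \log(1+x)\right)
\end{aligned}
\]

Again the LHS and RHS are equal as $b \to 0$. All we show again is that the LHS increases faster than the RHS, by comparing derivatives with respect to $b$:

\[
\frac{(1-p)b\alpha^2}{e(e-\alpha)} > p\left(\frac{1}{(1+x)^2} - \frac{1}{1+x}\right)\frac{\alpha}{e-\alpha}
\]

Clearly the RHS is negative when $b \in (0, 1]$ and the LHS is positive. Hence proved.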

Thus $L(b, p)$ is increasing in both $b$ and $p$. And the condition required is that $L(b, p)$ is greater than some constant value which depends on $v$. Hence for any $v$ there exists a smooth $(b_0, p_0)$ curve such that for all $b \ge b_0$ and $p \ge p_0$, the revenue rate of the reward program merchant is larger.
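Because $L(b, p)$ is increasing in $b$, the smallest $b$ satisfying inequality (4.11) for a given $p$ and $v$ can also be located by bisection. The following Python sketch does this under assumptions of its own: the helper names, the tolerance, the `None` return when no $b$ in $[0, 1]$ satisfies the inequality, and the sample parameter values (again with $e$ taken as the plain number 1.0) are all hypothetical.

```python
import math


def lhs_4_11(b, p, alpha, e):
    """L(b, p): left side of inequality (4.11), for b > 0."""
    x = b * alpha / (e - alpha)
    return p * (1.0 - math.log1p(x) / x) + (1.0 - p) * b * alpha / (2.0 * e)


def rhs_4_11(v, alpha, e):
    """Right side of inequality (4.11); depends on v but not on b or p."""
    return (alpha / e) * (1.0 - v) / (1.0 - alpha * v + 1.0 - v)


def lower_bound_on_b(p, v, alpha, e, tol=1e-10):
    """Approximate the smallest b in (0, 1] with L(b, p) > rhs, or None."""
    target = rhs_4_11(v, alpha, e)
    if lhs_4_11(1.0, p, alpha, e) <= target:
        return None  # even b = 1 does not satisfy (4.11)
    lo, hi = 0.0, 1.0  # L(b, p) -> 0 as b -> 0, so (4.11) fails near lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs_4_11(mid, p, alpha, e) > target:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(lower_bound_on_b(p=0.9, v=0.1, alpha=0.9, e=1.0))
```

Repeating the search over a range of $p$ values would trace out an approximation of the $(b_0, p_0)$ curve for a fixed $v$.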

Chapter 5

Future Directions

Loyalty reward programs constitute a large portion of the retail industry. In this thesis, we modeled

three aspects of loyalty reward programs: strategic network formation of coalitions; an alternate

decentralized methodology to settle transactions within coalition loyalty programs; and customer

choice dynamics in a competitive duopoly of a frequency reward program and traditional pricing

merchant. We showed that conducting Nash Bargaining between pairs of merchants is a strong tool

to negotiate the formation of coalitions. Our model for settling credits in coalition loyalty programs

has properties which make it a viable alternative. And we formulated conditions under which it is optimal for a merchant to offer a reward program against traditional pricing.

Though this thesis provides a holistic overview of different aspects of loyalty reward programs, there are many limitations pointing toward future work. In Chapter 2, we investigated the negotiation between different merchants in coalition loyalty programs, but ignored many business aspects like complementarity of route structures, government regulations, and existing business ties, to name a few. Moreover, we assumed many aspects of customer behavior to be exogenous. For instance, in our model, customers a priori chose their base merchant, into whose currency they converted currency earned from other merchants. But this choice could very well depend on the coalition structures themselves. Also, the pricing of available services at different merchants heavily depends on the demand and supply gap, which we did not take into consideration. Endogenizing some of these aspects into a more holistic model would be an interesting direction for future work. One important open problem in the credit settlement process we introduced in Chapter 3 is the long term stability of the network, i.e., depending on different transaction regimes initiated by customers, what exchange rates between merchants could lead to long term transactional liquidity. Additionally, these transaction regimes could be dynamic themselves; for instance, they could depend on price fluctuations, and not be exogenously given. In Chapter 4, we discussed how competitive pricing affects customer behavior toward the reward program, and formulated an optimal design of a standalone reward program. But competitive pricing would also affect the customer demand for different


merchant currencies in a coalition loyalty program, which in turn could influence the formation of coalitions and the settlement process between different merchant partners. In conclusion, a framework combining all the above mentioned aspects, i.e., to understand customer behavior in coalition loyalty programs, its effects toward setting up optimal reward schemes and formation of coalitions with strategic merchants, could provide effective mathematical machinery toward automating some marketing processes in the retail sector.
