DESIGN AND ANALYSIS OF LOYALTY REWARD PROGRAMS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF MANAGEMENT SCIENCE &
ENGINEERING
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Arpit Amar Goel
March 2017
http://creativecommons.org/licenses/by-nc/3.0/us/
This dissertation is online at: http://purl.stanford.edu/jv609vj6030
© 2017 by Arpit Amar Goel. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Ashish Goel, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Dan Iancu
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Ramesh Johari
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost for Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Summary
This thesis provides an in-depth analysis of two major components in the design of loyalty reward
programs. First, we discuss the design of coalition loyalty programs - schemes where customers
can earn and spend reward points across multiple merchant partners. Second, we conduct a
model-based comparison of a standalone loyalty reward program against traditional pricing - we
theoretically characterize the conditions under which it is better to run a reward program within a
competitive environment.
Coalition loyalty programs are agreements between merchants allowing their customers to exchange
reward points from one merchant to another at agreed-upon exchange rates. Such exchanges
lead to transfer of liabilities between merchant partners, which need to be frequently settled using
payments. We first conduct an empirical investigation of existing coalitions, and formulate an an-
alytical model of bargaining for merchant partners to agree upon the exchange rate and payment
parameters. We show that our bargaining model produces networks that are close to optimal in
terms of social welfare, in addition to cohering with empirical observations. Then, we introduce
a novel alternative methodology for settling the transferred liabilities between merchants participating
in a coalition. Our model has three interesting properties – it is decentralized, arbitrage-proof,
and fair against market power concentration – which make it a viable alternative to how settlements
currently happen in coalition loyalty programs.
Finally, we investigate the design of an optimal reward program for a merchant competing against
a traditional pricing merchant, for varying customer populations, where customers measure their
utility in rational economic terms. We assume customers are either myopic or strategic, and have
a prior loyalty bias toward the reward program merchant, drawn from a known distribution. We
show that for the reward program to perform better, it is necessary for a minimum fraction of the
customer population to be strategic, and the loyalty bias distribution to be within an optimal range.
This thesis is a useful read for marketers building promotional schemes within retail, researchers
in the field of marketing and behavioral science, and companies investigating the intersection of
customer behavior, loyalty, and virtual currencies.
Acknowledgments
My PhD is a culmination of five years of professional and personal development which would not
have been possible without the support of numerous people, to whom I would like to extend very
special thanks.
First, I thank my advisor, Ashish Goel, for his continuous guidance and support. Ashish has
a very strong ability to identify problems that have long term impact. Back in 2012, within our
first few interactions, some of which included key industry leaders, we were able to formulate an
abstract research problem that almost served as my dissertation thesis. Consequently, our research
proposal was not only theoretically relevant, but practical to the industry. In addition, Ashish has
been a really patient and encouraging advisor. The course of a PhD is full of ups and downs, and
he made sure to find interesting collaborations for me during the low times, which kept me excited
toward the bigger picture. And most importantly, I learnt from him key interpersonal skills like
clear articulation, communication, and disciplined commitment toward finishing goals.
I thank Ramesh Johari and Dan Iancu for serving on my reading committee and taking out the
time to read through my dissertation to offer valuable comments; and Professors Warren Hausman
and Itai Ashlagi for serving on my oral defense committee. Teaching is indeed the best way to learn
a subject, and I gained tremendous knowledge across algorithms, optimization, and data science by
offering my services as a teaching assistant to different classes. I would like to thank professors who
provided me with invaluable teaching experience in the past years: Ramesh Johari, Yinyu Ye, Ashish
Goel, and Tim Roughgarden. And special thanks to Professor Yinyu Ye and Tim Roughgarden for
introducing me to research in my very first year at Stanford. I entered Stanford as a Masters student
back in 2011, and I could not have transitioned into PhD research without the guidance and support
from the two of them.
Stanford is indeed a magical place when it comes to the quality of education along with the
diversity of thoughts. I was fortunate to make use of this diversity to a great extent. I thank Matt
Jackson for his course on Social and Network Economics; Serge Plotkin for his Algorithms series
courses which I thoroughly enjoyed; Ben Van Roy for his course series on Dynamic Programming;
Yinyu Ye for his course series on Optimization; and Peter Glynn for Stochastic Processes. All
these classes provided insightful information for my research. In addition, I thank Ann Grimes and
Stanford Venture Studio for introducing me to the entrepreneurial ecosystem at Stanford; Andrew
Ng, Chris Manning, and Jure Leskovec for their classes on machine learning, natural language
processing, and data mining. And a very big thanks to my music teachers at Stanford - Timothy
Zerlang for teaching me Piano and Claire Giovannetti for teaching me singing.
My Stanford experience is truly incomplete without the mention of the numerous collaborations,
relationships, and friendships I developed, some of which I hope will last throughout my life. I
thank Ali Dasdan for hosting me at Turn Inc. for the summer in 2012; Gloria Lau and Craig
Martell for hosting me at LinkedIn Corp. for the summer in 2014; Sal Uryasev for being an awesome
mentor for my summer internship at LinkedIn; and postdocs Pranav Dandekar and Sid Banerjee for
helping me learn research fundamentals during my early days. I thank the amazing cohort of my lab
friends - Shayan, Peter, Hong, Hongsek, Camelia, Vijay, Nikhil, Carlos, Nolan - with whom I not
only shared the frustrations during research, but also had many fun conversations around politics,
science, philosophy, history, and religion. And a very big thanks to some of my really close friends -
Bharath, Subodh, Anshul, Raghu, Bobo, Nipun, Sparsh, Aju, Rose, Navneet, Apaar - for offering me
support like a family when I needed it the most. Especially Bharath and Subodh for the few amazing
“startup” projects we worked on together, and will continue to work on. A very special thanks to
my best friend Purvi for really taking out the energy to go through every aspect of my thesis and
calm me down to simplify this enormous journey toward the end.
Finally, and most importantly, I thank my family. My parents taught me the principles of
commitment and discipline for hard work. My nanny (I call her pappy) played an important role
in my upbringing and I can’t thank her enough for teaching me the simplicity of life. My elder
siblings, Ambika and Shakti, have always guided me throughout my schooling and education. I
owe most of my knowledge to the two of them. Ambika, and my brother-in-law Anurag, have been
very supportive during the five years of my PhD, and have frequently visited me in times of need.
The past few years have been challenging not just with the PhD research, but because we all went
through a lot as a family. Our ability to stand tall together and support each other during such
times has led me to this milestone.
Contents
Summary
Acknowledgments
1 Introduction
2 Network Formation of Coalition Loyalty Programs
2.1 Bilateral Negotiation Model
2.2 Empirical Investigation
2.3 Model Extensions
2.4 Conclusions
2.5 Appendix
2.5.1 Analysis of Social Welfare Gap Example
2.5.2 Proof of Proposition 1
2.5.3 Proof of Theorem 2.3.1
3 Credit Settlement in Coalition Loyalty Programs
3.1 Model
3.2 Results
3.3 Conclusions
3.4 Appendix
3.4.1 Proof of Theorem 3.2.2
4 Optimal Design of a Frequency Reward Program
4.1 Model
4.1.1 Customer Behavior Model
4.1.2 Merchant Objective
4.2 Results
4.2.1 Customer Choice Dynamics
4.2.2 Merchant Objective Dynamics
4.3 Conclusions
4.4 Appendix
4.4.1 Proof of Lemma 9
4.4.2 Proof of Lemma 10
5 Future Directions
List of Figures
1.1 Some Examples of Coalition Loyalty Programs
2.1 Extended Partners of Star Alliance Members
2.2 Exchange Rates within the Star Alliance Program
2.3 Partnerships Across Multiple Merchants
3.1 Illustrative Example of Dynamic IOU Settlement
3.2 Transaction from u to v Leading to Cycle C
4.1 Rate of revenue for reward merchant as a function of α (with k = eα(1��)) for different distributions. For all distributions, � = 0.9, p = 0.9 and v varies as labeled. The uniform distribution is on (0, b] with b = 0.9; the normal distribution has µ = 0.5 and σ = 0.1; and the logit-normal distribution is the standard on [0, 1].
4.2 Regions where RoR_A > RoR_B (blue), where RoR_A > b/2 (yellow) and where both are true (green) for different values of α. In all cases, � = 0.95, v = 0.05 and � drawn uniformly on (0, b].
4.3 The upper and lower bounds on b as a function of p. Here v = 0.05 and α → e.
4.4 Bounds on b for various values of p and v at α → e. Top shows lower bounds on b for RoR_A ≥ RoR_B and bottom shows upper bounds of b for RoR_A ≥ b/2.
Chapter 1
Introduction
Loyalty reward programs constitute a huge market in consumer retail, primarily serving as a mechanism
for customer acquisition and retention (Sharp and Sharp [1997], Taylor and Neslin [2005]).
Over 48 billion dollars in perceived value of rewards is issued in the United States alone every year,
with every household having over 19 loyalty memberships on average (Berry [2013]). This market
includes global programs run by credit card companies, hotels, and airlines, as well as local
merchants like restaurants, grocery stores, and retail stores.
There are many different forms of loyalty reward programs. One common form is the frequency
reward program, such as an airline frequent flyer program, wherein customers earn a certain number of
points from every purchase. These points can be subsequently redeemed for rewards. Other forms of
loyalty reward programs include punch cards, multi-tier rewards, and cashback rewards. In a punch
card reward program, a customer can claim a reward after making some fixed number of purchases
from the merchant – for instance redeeming one free car wash on getting ten car washes from the
same merchant. In a multi-tier reward program, the perceived value of rewards increases non-linearly
as a customer’s rewards point status with the merchant increases from bronze to silver to gold –
this non-linearity in rewards is what incentivizes members of a program to become more loyal to
the merchant1. Cashback reward programs allow customers to redeem a fixed percentage of their
total spending with the merchant as dollar cashback, subject to some minimum
expenditure constraints.
1 For example, every major airline offers tiered rewards with a premium status on attaining sufficient miles.
Over time, many stand-alone loyalty programs have agglomerated into larger coalition programs
which allow customers to earn and redeem points across different merchant partners participating
in the coalition. The observed coalition networks are surprisingly complex, encompassing both
pairwise partnerships like the Safeway-Chevron coalition (cf. Fig. 1.1a) as well as centralized coalition
loyalty programs such as Star Alliance (cf. Fig. 1.1b) and OneWorld Alliance (international airline
alliances), Nectar (U.K.), Air Miles (Canada), Payback (Germany), Fly Buys (Australia), etc. Often
a merchant is part of multiple such networks, leading to coalitions which are complex combinations
of centralized as well as pairwise partnerships. For instance, United Airlines, in addition to being
part of the Star Alliance, has pairwise partnerships with local airlines and merchants (cf. Fig. 1.1c).
Figure 1.1: Some Examples of Coalition Loyalty Programs. (a) Safeway Chevron Partnership; (b) Airline Members of Star Alliance; (c) Partners of United Airlines.
These coalition loyalty programs are agreements between merchants allowing their customers
to exchange reward points from one merchant to another at agreed-upon exchange rates. For
instance, customers carrying points from Marriott Hotel’s reward program can convert them into
miles from United Airlines at an exchange rate of four to one, i.e., 10,000 Marriott points convert
into 2,500 United miles. Such an exchange, though it appears to be unrewarding for the customer,
is often beneficial – the customer might be carrying currency from United Airlines and the above
mentioned exchange might enable an immediate redemption of a free flight from United Airlines.
Usually, a high threshold of points is required to redeem worthwhile rewards from merchants, and
thereby an exchange as mentioned above incentivizes a customer to convert currencies earned at
merchants (s)he visits infrequently into currencies of merchants (s)he visits more frequently. That
is, customers often choose one merchant to accumulate currency with, and convert points they earn
at other merchants into their preferred merchant’s currency. Thus, the frequency of these currency
exchanges depends on which merchants are chosen by customers for currency accumulation, how
frequently customers visit different merchants, and what exchange rates are agreed upon
between pairs of merchants. An interesting thing to note here is that even if two merchants have
no direct partnership, a customer could still convert points between them via a path – for instance
Hawaiian Airlines is not part of Star Alliance but has a partnership with United Airlines which
is a part of Star Alliance (as shown in Fig. 1.1b and Fig. 1.1c). Hence a customer can effectively
convert miles earned on Hawaiian Airlines into miles with any Star Alliance member like Lufthansa
Airlines. Exchange rates vary widely across different merchant partners,
and such exchanges of points lead to transfer of liabilities which need to be frequently settled.
Setting up these programs requires negotiations between different participating business entities.
Though coalition loyalty programs are very popular and well studied in the literature (Capizzi and
Ferguson [2005], Cao et al. [2015], Lazzarini [2007]), there is little formal understanding of the
structure and strategic formation of such networks, specifically the exchange rates between different
merchants. In Chapter 2, we introduce a model to understand the inter-partner utilities that arise in
coalition loyalty programs and study strategic network formation of such coalitions. We use pairwise
bargaining between merchant pairs as a tool for negotiating partnerships over the network and show
that this process is efficient – the network structures obtained theoretically cohere with empirically
observed data, and the social welfare obtained is close to optimum for practically relevant scenarios.
As mentioned above, the currency exchanges initiated by customers within coalition loyalty
programs introduce liabilities between merchants which frequently need to be settled. Merchants
settle these liabilities as follows: they set a mile to dollar value and keep track of the number of
miles owed within a fixed time frame, which is usually one fiscal year. They then settle accounts by
paying each other the respective mile to dollar value times the number of miles owed. This settlement
process is inefficient and risky. First, it requires a centralized currency like U.S. dollars. Second, as
we discuss in the next chapter, the negotiation of the payment parameters is a complicated process.
Third, merchants often change the speed at which they roll out currency to their customers –
for instance, a customer might receive 50,000 bonus miles from United Airlines sometime during the
year for being a “valuable” customer. This could lead to inflation of that merchant’s currency. But
this change is not reflected in the already decided exchange rate and the mile-to-dollar value for
settling transferred liabilities, and could thereby lead to tensions between the partnering merchants.
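The fixed-period settlement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name gross_settlement, the direction of liability (the issuer of the converted points owes the merchant who accepts them), and all figures are hypothetical and are not taken from any actual coalition contract.

```python
from collections import defaultdict


def gross_settlement(transfers, dollar_value):
    """Dollar amounts owed at the end of one settlement period.

    transfers: iterable of (issuer, acceptor, miles), meaning customers moved
        `miles` of the issuer's currency into the acceptor's program, so the
        issuer owes the acceptor for them (assumed direction of liability).
    dollar_value: issuer -> negotiated dollar value of one of its miles.
    Returns {(payer, payee): dollars}.
    """
    owed = defaultdict(float)
    for issuer, acceptor, miles in transfers:
        # Each transfer is valued at the issuer's fixed mile-to-dollar value.
        owed[(issuer, acceptor)] += miles * dollar_value[issuer]
    return dict(owed)


# Hypothetical figures for one fiscal year.
transfers = [
    ("Marriott", "United", 20_000),
    ("United", "Marriott", 10_000),
    ("Marriott", "United", 5_000),
]
dollar_value = {"Marriott": 0.005, "United": 0.02}
payments = gross_settlement(transfers, dollar_value)
```

Because dollar_value is fixed for the whole period in this sketch, a mid-year bonus issuance by one merchant changes nothing in the computed payments, which is the mismatch the third point above describes.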
In Chapter 3, we propose an alternate framework for dynamic settlement of transferred liabilities
in a coalition loyalty program: merchants commit to accepting each others’ rewards points up to a
limit at prespecified exchange rates. We refer to these commitments as IOUs2. Customers exchange
reward points between merchant pairs along paths of sufficient IOUs. We refer to such exchanges of
points as transactions. Transactions are routed via paths of maximum exchange rate to minimize
conversion loss. They are also allowed to occur, through trusted intermediaries, between merchant
pairs who do not directly commit to accepting each other’s currency. Past transactions are
accounted for by introducing reverse IOUs, which are effectively promises to allow future transactions
to settle already transferred liabilities. Since the credit limits imposed are in the respective currencies
of participating merchants, no central currency like U.S. dollars is required for settlements. Mutual
credit limits between two merchants impose the following restriction: if the flow of points between
two merchants does not balance out and replenish the credits, the system stops allowing conversion
of points between them along the direction of depleted credit, causing losses to the merchant who is
not able to reverse the transferred liabilities. We also show two further properties of the
system. First, transactions never lead to any new arbitrage opportunities in the system, though they
could create new IOUs. Second, the state of the system is independent of the paths chosen
for transactions as long as the paths maximize the exchange rate, i.e., nodes are not incentivized to
demand payments to act as an intermediary for transactions. In short, we introduce a decentralized
model for settling transactions between merchants participating in a coalition loyalty program, and
show properties of the system which make it a viable alternative to the current credit settlement process.
2 IOU stands for “I Owe You” - Definition from Investopedia: An informal document that acknowledges a debt owed, and this debt does not necessarily involve a monetary value as it can also involve physical products.
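To make the IOU mechanism above more concrete, here is a toy Python ledger. It is a sketch under loudly stated assumptions rather than the model developed in Chapter 3: the class IOULedger, its methods, the rule that a reverse IOU is created at the reciprocal exchange rate, and the numbers are all hypothetical choices made for illustration.

```python
class IOULedger:
    """Toy ledger of IOUs between loyalty programs (illustrative only)."""

    def __init__(self):
        # (acceptor, issuer) -> [remaining limit in issuer points, rate],
        # where `rate` is acceptor points paid out per issuer point.
        self.iou = {}

    def commit(self, acceptor, issuer, limit, rate):
        """`acceptor` promises to accept up to `limit` points of `issuer`."""
        self.iou[(acceptor, issuer)] = [float(limit), float(rate)]

    def convert(self, path, amount):
        """Convert `amount` points of path[0] into points of path[-1].

        `path` is assumed to be a simple path (no repeated merchants).
        Raises ValueError if some hop lacks sufficient remaining IOU.
        """
        hops, x = [], float(amount)
        for issuer, acceptor in zip(path, path[1:]):
            entry = self.iou.get((acceptor, issuer))
            if entry is None or entry[0] < x:
                raise ValueError(f"insufficient IOU from {acceptor} to {issuer}")
            hops.append((issuer, acceptor, x, entry[1]))
            x *= entry[1]
        for issuer, acceptor, used, rate in hops:
            self.iou[(acceptor, issuer)][0] -= used
            # Assumed reverse IOU: the issuer now accepts the acceptor's
            # points back, at the reciprocal rate, up to what was paid out.
            reverse = self.iou.setdefault((issuer, acceptor), [0.0, 1.0 / rate])
            reverse[0] += used * rate
        return x


# Hypothetical example: United accepts up to 10,000 Marriott points at 0.25.
ledger = IOULedger()
ledger.commit("United", "Marriott", limit=10_000, rate=0.25)
received = ledger.convert(["Marriott", "United"], 4_000)
returned = ledger.convert(["United", "Marriott"], received)
```

In this sketch the second conversion flows back along the reverse IOU and restores the original credit limit, so no dollar payment is needed; a one-directional flow would instead run the limit down until convert raises an error.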
Finally, in Chapter 4 we investigate the scenarios in which offering a loyalty reward program is
better for a merchant than traditional pricing. We look into the design of frequency reward
programs, where customers earn points as currency on their spending with merchants, and are able to
redeem these points for dollar-valued rewards after accumulating some threshold number of points. We
consider a competitive duopoly where one merchant offers a frequency reward program and the other
offers traditional pricing with discounts, and characterize a novel model of customer choice where
customers measure their utilities in rational economic terms. We assume two kinds of customers:
myopic and strategic (Yilmaz et al. [2016]). In addition, we assume that every customer has a
prior loyalty bias (Fader and Schmittlein [1993]) toward the reward program merchant, a parameter
drawn from a known distribution indicating an additional probability of choosing the reward program
merchant over the traditional pricing merchant. This bias increases the switching costs (Klemperer
[1995]) of strategic customers until a tipping point and we show that customer behavior exhibits a
phase transition: a customer is not incentivized to visit the reward program merchant before attaining
a certain number of points from it, and after that strictly prefers making purchases from
the reward program merchant. These behavioral patterns cohere with the empirically observed and
psychologically motivated customer choice dynamics in past literature, thereby validating our model
(Kivetz et al. [2006], Dreze and Nunes [2004], Gao et al. [2014]). Moreover, we assume that the
traditional pricing merchant is non-strategic, and characterize the optimal reward design choice
for the merchant offering the frequency reward program, specifically, how the merchant should
decide the point thresholds and dollar value of rewards to optimize its revenue share
from the participating customer population. We impose an important constraint: the merchant
has to offer a one-design-fits-all reward program for the entire participating customer population,
and is not allowed to personalize the program for different customer segments. We show that the
optimal reward design parameters for maximizing the expected revenue for the reward program
merchant are independent of the above mentioned aspects of the customer population. We refer to
the reward program as being effective if the revenue objective of the merchant is better than that
of the traditional pricing merchant, and better than the revenue objective it could have earned by
not running any reward program. We show that for the reward program to be effective, a minimum
fraction of the participating customer population needs to be strategic. Correspondingly, there
is an optimal range of the loyalty bias distribution parameter. If the bias is high, the reward program
creates loss in revenues, as customers effectively gain rewards for “free”, whereas a low value of bias
leads to loss in market share to the competing merchant. In short, if a merchant can estimate the
customer population parameters, our framework and results provide theoretical guarantees on the
pros and cons of running a reward program against traditional pricing.
The remainder of this thesis is structured as follows. In Chapter 2, we study strategic network formation
of coalition loyalty programs. In Chapter 3, we formulate a novel model for settling credits in
coalition loyalty programs. And in Chapter 4, we investigate the design of optimal frequency reward
programs. Finally, we conclude and discuss future directions in Chapter 5.
Chapter 2
Network Formation of Coalition
Loyalty Programs
In this chapter, we look into the network formation problem in coalition loyalty programs. The
exchange of points between different merchants initiated by customers creates additional value and
costs to the merchants: value in the form of attracting new customers, and cost in the form of
losing customers to potential competitors. Such exchanges also transfer point liabilities between
merchant pairs, and hence require mutual settlement via payments. The exchange rates and payment
parameters are negotiated by the merchants through some bargaining process and lead to formation
of networks which are surprisingly complex (cf. Fig. 2.1). We model this as a network bargaining
problem. We show that merchants resorting to pairwise Nash Bargaining produce network structures
identical to those obtained by maximizing social welfare. Moreover, we show that there is no welfare
loss by such pairwise bargaining if there is no competition, and the loss is small in a large class of
practically motivated scenarios. This property is attractive since pairwise negotiations are much
more cost-effective and easier to implement than centralized solutions. Our theoretical predictions
cohere with empirical observations we make about such networks.
Modeling Coalition Loyalty Programs: A network of loyalty programs can be viewed as a
weighted directed graph, with nodes corresponding to merchants, and an edge from a merchant A
to merchant B with weight rAB corresponding to an agreement via which customers can convert 1
point issued by merchant B into rAB points issued by merchant A. We henceforth refer to A as the
source node and B as the sink node of this edge1 and points issued by A as A-points and points
issued by B as B-points.
1 These exchange rates are public, and many of them can be collected from the service: http://www.webflyer.com.
Figure 2.1: Extended Partners of Star Alliance Members: Shown are 4 Star Alliance members (Aeroplan, Avianca, Lufthansa, and Egypt Air) and their partnerships with local airline and hotel chains in addition to mutual partnerships between each other. An edge between two nodes indicates that the two merchants allow mutual exchange of reward points. The local partners of two different airlines usually do not form any mutual partnerships.
We aim to understand the structure of these coalition programs. We do so by studying a strategic
network formation game. Our model incorporates the following critical aspects of these networks:
• The non-linear nature of rewards, which encourages customers to convert all points into their
‘home program’.
• Presence of an edge (A,B) increases demand for B’s services by A’s loyal customers, thereby
increasing B’s revenues.
• Conversely, A may incur a cost due to lost sales, in particular, if B is a competitor.
• A higher exchange rate rAB leads to higher demand at B from A’s customers, as it earns them
a higher number of A-points. Moreover, if there are multiple (possibly multi-hop) paths between
A and B, then A’s loyal customers will use paths with the highest product of exchange rates to
maximize the number of A-points they receive.
• Loyalty points are also a source of liability for the issuing merchant, as they can be redeemed in
the future by the customers (Chun et al. [2015]). Hence by permitting the formation of the edge
(A,B), A makes a commitment to accept a share of B’s liability, for which B needs to compensate
A.
Finally, a critical operational aspect of these networks is that they are often formed via negotiations
between merchants, and the resulting contracts cannot be easily modified. Moreover, in the absence
of a central agency, these contracts are usually negotiated bilaterally between merchant pairs, which
has a much lower setup cost than forming a centralized coalition.
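The observation above that customers use the path with the highest product of exchange rates can be illustrated with a small Python sketch over the weighted directed graph. The function best_conversion, the merchant "HotelX", and every rate below are hypothetical; the sketch also assumes that no cycle of conversions has a rate product above one (no arbitrage), which is what lets a Bellman-Ford style relaxation terminate with a simple path.

```python
def best_conversion(rates, held, home):
    """Best multi-hop rate for turning `held`-points into `home`-points.

    rates[(a, b)] = r_ab means one b-point converts into r_ab a-points,
    i.e. an edge from a to b in the notation of this chapter.
    Assumes no cycle of conversions has rate product above 1.
    Returns (rate, path), or (0.0, []) if no conversion path exists.
    """
    nodes = {n for edge in rates for n in edge}
    best = {held: 1.0}  # best known amount of each currency per held-point
    pred = {}           # predecessor on the best path found so far
    for _ in range(max(len(nodes) - 1, 0)):
        changed = False
        for (a, b), r in rates.items():
            if b in best and best[b] * r > best.get(a, 0.0):
                best[a] = best[b] * r
                pred[a] = b
                changed = True
        if not changed:
            break
    if home not in best:
        return 0.0, []
    path = [home]
    while path[-1] != held:
        path.append(pred[path[-1]])
    path.reverse()
    return best[home], path


# Hypothetical rates: two routes from Hawaiian miles to Lufthansa miles.
rates = {
    ("United", "Hawaiian"): 0.5,
    ("Lufthansa", "United"): 0.8,
    ("HotelX", "Hawaiian"): 0.5,
    ("Lufthansa", "HotelX"): 0.5,
}
rate, path = best_conversion(rates, held="Hawaiian", home="Lufthansa")
```

With these made-up rates the route through United yields 0.5 × 0.8 = 0.4 Lufthansa miles per Hawaiian mile, against 0.25 through the hotel, so a customer whose home program is Lufthansa would route through United.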
Our Contributions: We study strategic network formation of coalition loyalty programs, under
a model which incorporates all the above aspects. In Section 2.1, we present the model, and fully
characterize the Nash Bargaining solution for two merchants. We also show that Nash Bargaining
maximizes the social welfare in the two merchant setting.
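As a rough illustration of why Nash Bargaining with a payment can maximize joint welfare for two merchants, the Python sketch below uses the textbook transferable-utility setting: the pair picks the exchange rate that maximizes the sum of their utilities, and a side payment equalizes their gains over the no-deal outcome. The utility functions, the zero disagreement payoffs, the rate grid, and the name nash_bargain are all assumptions made for this example and are not the model of Section 2.1.

```python
def nash_bargain(u_a, u_b, rate_grid):
    """Nash bargaining over an exchange rate with a side payment.

    u_a, u_b: utilities of merchants A and B as functions of the rate,
        before any payment (hypothetical functions in this sketch).
    Disagreement payoffs are assumed to be zero for both merchants.
    Returns (rate, payment from B to A, gain of each merchant),
    or None when no rate gives a positive joint surplus.
    """
    # With transferable utility, the Nash product (u_a + t) * (u_b - t) is
    # maximized by first maximizing the joint surplus over the rate...
    r_star = max(rate_grid, key=lambda r: u_a(r) + u_b(r))
    surplus = u_a(r_star) + u_b(r_star)
    if surplus <= 0:
        return None
    # ...and then choosing the payment t that splits the surplus equally.
    payment = (u_b(r_star) - u_a(r_star)) / 2
    return r_star, payment, surplus / 2


# Hypothetical utilities: A bears a cost growing with the rate, while B
# gains extra demand with diminishing returns.
def u_a(r):
    return -0.3 * r


def u_b(r):
    return r - 0.5 * r * r


grid = [i / 10 for i in range(11)]
r_star, payment, gain = nash_bargain(u_a, u_b, grid)
```

On this grid the joint surplus peaks at a rate of 0.7, and B pays A about 0.3325 so that both end up with the same gain of about 0.1225; the chosen rate is also the welfare-maximizing one, which mirrors the two-merchant claim above in this simplified setting.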
In Section 2.2, we conduct an empirical investigation of some existing coalition networks and
observe that even in the presence of a network the direct relationship between two merchants is
very closely associated with the degree of competition between them: the higher the competition, the lower
the mutual exchange rates. We argue that extending our model to multiple merchants and using
pairwise Nash Bargaining as a network bargaining solution produces networks that are structurally
identical to the ones observed empirically: complete K-partite graphs emerge, where the merchants
within each partition are competitors, but across partitions they do not compete.
Finally, in Section 2.3, we characterize the gap between the optimal social welfare and the
welfare obtained by pairwise Nash Bargaining over the network. We first show via a counterexample
that bilateral Nash Bargaining does not maximize social welfare, thereby indicating that centralized
coordination, though complicated to conduct, is useful in some settings. And, under mild conditions,
we show that the sub-optimality in social welfare under bilateral Nash Bargaining is small. A
particularly interesting case is where the merchants are completely heterogeneous and mutually
non-competing. In this case, it turns out that bilateral Nash Bargaining does maximize the social
welfare, and a complete directed graph emerges as the solution. We extend this result to the case
where merchants are less heterogeneous and semi-competitive.
In a nutshell, our results suggest that pairwise Nash Bargaining over the network leads to similar
structures as observed empirically, and achieves optimal or near-optimal welfare, for practically
relevant situations.
Related Work: The management and impact of loyalty programs are well studied (Cao et al.
[2015], Taylor and Neslin [2005]) – however, the literature on coalition loyalty programs is primarily
empirical (Flores-Fillol and Moner-Colonques [2007]). Lazzarini [2007] provides a survey of the
evolving structures of airline coalitions, suggesting a shift from bilateral ties to more connected and
complex structures over time.
Strategic network formation models have been used to study friendships in social networks (Jackson
and Wolinsky [1996]), labor markets (Calvo-Armengol and Jackson [2007]), etc. – see Jackson
[2005] for an overview of such models. Our work is closest in spirit to the models of directed network
formation with unilateral decision making (Bala and Goyal [2000]).
Our work is also tangentially related to the literature on credit networks (DeFigueiredo and Barr
[2005]). In these networks, various authors have studied liquidity (Dandekar et al. [2011]), strategic
formation (Dandekar et al. [2015]), and credit updating (Resnick and Sami [2009]). Loyalty program
networks have features which are quite distinct from credit networks, in particular, the notion of
liability and the presence of exchange rates. To the best of our knowledge, our work is the first
attempt at formalizing the strategic aspects of these networks.
2.1 Bilateral Negotiation Model
We start by considering the case of two merchants, A and B, who both run individual loyalty
programs with their own loyal customer base, and are trying to negotiate a coalition loyalty program.
Let us first understand why A and B may both benefit from having a joint program. Suppose
that A’s loyal customers (whom we henceforth refer to as type-A customers) occasionally want to
avail services from B – arguably they are more likely to go to B for these services if the points that
they earn from B (henceforth, B-points) upon purchase can be converted back to A-points. This
excess demand is clearly beneficial to B – it not only brings in immediate revenues, but possibly
future revenues from type-A customers preferring B over its competitors. Moreover, the likelihood
of type-A customers bringing business to B will in general increase with the exchange rate $r_{AB}$,
i.e., the number of A-points earned by exchanging 1 B-point. Hence, the higher the exchange rate, the
better it is for B in this respect.
On the other hand, by allowing B-point to A-point conversions, A is taking on B’s liability of
providing value against the points accumulated by a type-A customer using B. The higher the
conversion rate, the higher is the volume of conversions, and thus higher is this liability. Further,
every visit of A’s customer to B results in some lost sales (current/future), in particular, if B is a
competitor. This can be viewed as a cost for A, which is high if A and B are competitors, and low
if they offer complementary services. Moreover, it is natural for B to compensate A for taking on
its liability. For similar reasons, A may also benefit if B allows conversions of A-points to B-points.
The exact exchange rates and payments made for each point converted are decided via a negotiation
between A and B.
Without loss of generality we can assume that the points in the two reward programs are nor-
malized, so that both A and B bear a cost of p dollars for each point redeemed by a customer.
Suppose that the exchange rates are rAB and rBA. One concern in any exchange network is the
presence of arbitrage opportunities – to prevent this, we require the product of rAB and rBA to be
not more than 1 (since otherwise customers can increase their points unboundedly for free). In fact,
in most cases, the observed exchange rates between loyalty programs are usually less than 1. Hence
we impose a stronger requirement that all rates lie in [0, 1] (see footnote 2).
Footnote 2: Qualitatively, our results do not change if we impose the weaker condition that the product
of pairwise exchange rates is bounded by 1.
Next, we assume that the rate at which points are converted from B to A, by type-A customers
buying from B, is an affine function of $r_{AB}$: $\theta_{AB} r_{AB} + \delta_{AB}$ (and similarly, we define
$\theta_{BA} r_{BA} + \delta_{BA}$ to be the rate of flow of points in the opposite direction). $\theta_{AB} + \delta_{AB}$ can be
thought of as the rate at which type-A customers visit B and exchange B-points for A-points under the
highest possible exchange rate (i.e., 1), while $\delta_{AB}$ is the rate at which type-A customers visit B when
the exchange rate is 0. Note that this quantity allows us to abstract away the pricing by different
merchants as well as their schemes to grant points to their customers. For simplicity we set the $\delta$
values to 0, although again our qualitative results remain unchanged if we relax this assumption.
We define $c_{AB} \in [0, 1]$ to be the competitiveness between the two merchants: value 1 refers to
the two merchants directly competing and offering substitutable services, and value 0 refers to them
offering complementary services. Note that competitiveness is symmetric, i.e., $c_{AB} = c_{BA}$. Next, let
$a_i \ge 0$ be the value that merchant $i$ ($i \in \{A, B\}$) obtains per point earned and converted by a type-$j$
customer ($j \in \{B, A\}$). As mentioned earlier, this value can be attributed to customer acquisition
and viewed as a combination of immediate revenues due to the purchase, as well as future revenues
due to the possibility that the customer may prefer $i$ over its competitors in the future. On the
other hand, we define the cost perceived by $i$ for every point converted from $j$ to $i$ as $a_j \times c_{ij}$, i.e., if
the two merchants are direct competitors, the perceived cost of losing a customer by one merchant
is the same as the perceived value of gaining the customer by the other merchant. Finally, we define $q_{ji}$
to be the payment made by $j$ to $i$ for every point conversion initiated by a type-$i$ customer from $j$
to $i$.
To summarize: for $i \in \{A, B\}$ and $j \neq i$, we have:
– Rate of points-transfer from j to i: $\theta_{ij} r_{ij}$
– Payment received by i for the above transfer: $\theta_{ij} r_{ij} q_{ji}$
– Perceived cost of i for the above transfer: $\theta_{ij} r_{ij} a_j c_{ij}$
– Cost of i for redeeming transferred liabilities: $\theta_{ij} r_{ij} \cdot r_{ij} p$
– Rate of points-transfer from i to j: $\theta_{ji} r_{ji}$
– Value received by i due to the above transfer: $\theta_{ji} r_{ji} a_i$
– Payment made by i for the above transfer: $\theta_{ji} r_{ji} q_{ij}$
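Using the above, the utility derived by $i = A, B$ is:
$$u_i = -\theta_{ij} r_{ij}^2 p - \theta_{ij} r_{ij} a_j c_{ij} + \theta_{ij} r_{ij} q_{ji} + \theta_{ji} r_{ji} (a_i - q_{ij}). \tag{2.1}$$
And the social welfare (i.e., the sum of utilities) is:
$$\begin{aligned}
u_i + u_j &= \theta_{ij} r_{ij} (a_j - c_{ij} a_j - r_{ij} p) + \theta_{ji} r_{ji} (a_i - c_{ji} a_i - r_{ji} p) \\
&= \theta_{ij} r_{ij} (a_j (1 - c_{ij}) - r_{ij} p) + \theta_{ji} r_{ji} (a_i (1 - c_{ji}) - r_{ji} p)
\end{aligned} \tag{2.2}$$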
The quantity $a_j \times (1 - c_{ij})$ is the difference between the benefit that merchant j gets, and the
perceived cost to merchant i, due to a type-i customer visiting j. This is the additional social welfare
in the two merchant network per transferred liability from j to i. Note that this additional welfare
is maximized when the two merchants are complementary ($c_{ij} = 0$). Intuitively this means that
the two merchants offering complementary services can combine their reward programs by offering
relevant exchange rates to minimize the outflow of their customers to other competitors.
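As an illustration, Eqs. 2.1 and 2.2 can be evaluated numerically. The following is a minimal Python sketch and not part of the original analysis: the function and argument names are hypothetical, and the parameter values are arbitrary. It checks that the payments cancel out of the sum of utilities.

```python
def utility(theta_in, r_in, theta_out, r_out, a_self, a_other, c, p, q_recv, q_paid):
    """Utility of merchant i as in Eq. 2.1 (argument names are illustrative).

    theta_in, r_in   -- theta_ij, r_ij: flow of points transferred from j to i
    theta_out, r_out -- theta_ji, r_ji: flow of points transferred from i to j
    a_self, a_other  -- a_i, a_j
    c                -- competitiveness c_ij (symmetric)
    p                -- cost per redeemed point
    q_recv, q_paid   -- q_ji (received by i), q_ij (paid by i)
    """
    redemption_cost = theta_in * r_in ** 2 * p
    perceived_cost = theta_in * r_in * a_other * c
    payment_received = theta_in * r_in * q_recv
    net_gain_from_outflow = theta_out * r_out * (a_self - q_paid)
    return -redemption_cost - perceived_cost + payment_received + net_gain_from_outflow


def social_welfare(theta_ij, r_ij, theta_ji, r_ji, a_i, a_j, c, p):
    """Sum of utilities as in Eq. 2.2; the payments do not appear."""
    return (theta_ij * r_ij * (a_j * (1 - c) - r_ij * p)
            + theta_ji * r_ji * (a_i * (1 - c) - r_ji * p))


def total_utility(theta_ij, r_ij, theta_ji, r_ji, a_i, a_j, c, p, q_ij, q_ji):
    """u_i + u_j computed directly from Eq. 2.1 for both merchants."""
    u_i = utility(theta_ij, r_ij, theta_ji, r_ji, a_i, a_j, c, p,
                  q_recv=q_ji, q_paid=q_ij)
    u_j = utility(theta_ji, r_ji, theta_ij, r_ij, a_j, a_i, c, p,
                  q_recv=q_ij, q_paid=q_ji)
    return u_i + u_j


if __name__ == "__main__":
    # Arbitrary illustrative parameters (assumptions, not data from the text).
    params = dict(theta_ij=2.0, r_ij=0.3, theta_ji=3.0, r_ji=0.45,
                  a_i=1.2, a_j=0.8, c=0.25, p=1.0)
    for q_ij, q_ji in [(0.0, 0.0), (0.5, 0.2), (1.0, 0.9)]:
        print(q_ij, q_ji,
              total_utility(**params, q_ij=q_ij, q_ji=q_ji),
              social_welfare(**params))
```

Whatever payments are chosen, the two printed welfare values should agree up to floating-point error, which is the cancellation used later in the proof of Theorem 2.1.1.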
Solution Concept: We use Nash Bargaining as a solution concept to resolve bilateral negotia-
tions between merchants. Under this, the directed exchange rates and the payments are chosen to
maximize the product of the net utilities of the two merchants. The following result characterizes
the Nash Bargaining solution in this setting:
Theorem 2.1.1. Under the Nash Bargaining solution, we have for $i, j = A, B$,
1. $r_{ij} = \min\{\frac{a_j (1 - c_{ij})}{2p}, 1\}$.
2. $q_{ij}$ and $q_{ji}$ are any values that satisfy:
$$\theta_{ij} r_{ij} (a_j - 2 q_{ji} + a_j c_{ij} + r_{ij} p) = \theta_{ji} r_{ji} (a_i - 2 q_{ij} + a_i c_{ji} + r_{ji} p)$$
Moreover, these parameters also maximize the social welfare.
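Proof. First observe that the utility values of both merchants from Eq. 2.1 are:
$$u_i = -\theta_{ij} r_{ij}^2 p - \theta_{ij} r_{ij} a_j c_{ij} + \theta_{ij} r_{ij} q_{ji} + \theta_{ji} r_{ji} (a_i - q_{ij})$$
$$u_j = -\theta_{ji} r_{ji}^2 p - \theta_{ji} r_{ji} a_i c_{ji} + \theta_{ji} r_{ji} q_{ij} + \theta_{ij} r_{ij} (a_j - q_{ji})$$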
The threat point is still (0, 0), and hence under Nash Bargaining $u_i u_j$ is maximized. This implies
that the derivative of $u_i u_j$ w.r.t. $q_{ij}$ and $q_{ji}$ is 0, as these parameters are unconstrained. And for
the parameters $r_{ij}$ and $r_{ji}$, if the derivative of $u_i u_j$ is positive within the constraints of the exchange
rates (i.e., [0, 1]), then the value is maximized at the exchange rate equal to 1; otherwise, we find
the exchange rate by setting the derivative to 0. Observe that for $x$ being any of the parameters:
$$\frac{\partial (u_i u_j)}{\partial x} = u_j \frac{\partial u_i}{\partial x} + u_i \frac{\partial u_j}{\partial x} = 0$$
Now we’ll calculate $\frac{\partial u_i}{\partial x}$ and $\frac{\partial u_j}{\partial x}$ for each $x$:
$$\frac{\partial u_i}{\partial q_{ij}} = -\theta_{ji} r_{ji}, \qquad \frac{\partial u_j}{\partial q_{ij}} = \theta_{ji} r_{ji}$$
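The above three equations immediately imply $u_i = u_j$. Now observe the following:
$$\frac{\partial u_i}{\partial r_{ij}} = \theta_{ij} (q_{ji} - a_j c_{ij} - 2 r_{ij} p), \qquad \frac{\partial u_i}{\partial r_{ji}} = \theta_{ji} (a_i - q_{ij})$$
$$\frac{\partial u_j}{\partial r_{ji}} = \theta_{ji} (q_{ij} - a_i c_{ji} - 2 r_{ji} p), \qquad \frac{\partial u_j}{\partial r_{ij}} = \theta_{ij} (a_j - q_{ji})$$
From the above equations we get the following:
$$\frac{\partial (u_i u_j)}{\partial r_{ij}} = u_i \left( \theta_{ij} (a_j (1 - c_{ij}) - 2 r_{ij} p) \right) \tag{2.3}$$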
Now since $a_j (1 - c_{ij})$ is non-negative and the exchange rate $r_{ij}$ is bounded by 1, we get the
maximizing rate as $\min\{\frac{a_j (1 - c_{ij})}{2p}, 1\}$. The same result holds for $r_{ji}$. Also, equating $u_i$ and
$u_j$, we get the following:
$$\theta_{ij} r_{ij} (a_j - 2 q_{ji} + a_j c_{ij} + r_{ij} p) = \theta_{ji} r_{ji} (a_i - 2 q_{ij} + a_i c_{ji} + r_{ji} p)$$
It is easy to observe that one solution to the above equation is $q_{ji} = (a_j (1 + c_{ij}) + r_{ij} p)/2$ and
$q_{ij} = (a_i (1 + c_{ji}) + r_{ji} p)/2$, and we can substitute the appropriate values of $r_{ij}$ and $r_{ji}$ for the
different cases.
Showing that these parameters maximize social welfare is straightforward. First, we observe
from Eq. 2.2 that the social welfare is independent of the payment parameters as they cancel out.
Taking the derivative of the sum of the utilities in Eq 2.2 gives the same values of exchange rates as
obtained above.
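The statement of Theorem 2.1.1 can be checked numerically on a small instance. The sketch below is illustrative only: the helper names (`nash_rate`, `proof_payment`, `bargain`, `directed_welfare`) and the parameter values are assumptions made here, not part of the original text. It uses the exchange rate from the theorem and the particular payments exhibited in the proof, then evaluates both utilities from Eq. 2.1.

```python
def nash_rate(a_target, c, p):
    """r_ij = min{a_j (1 - c_ij) / (2p), 1} from Theorem 2.1.1."""
    return min(a_target * (1 - c) / (2 * p), 1.0)


def proof_payment(a_target, c, r, p):
    """q_ji = (a_j (1 + c_ij) + r_ij p) / 2, the payment exhibited in the proof."""
    return (a_target * (1 + c) + r * p) / 2


def bargain(theta_ij, theta_ji, a_i, a_j, c, p):
    """Rates, payments, and resulting utilities for the two-merchant model."""
    r_ij, r_ji = nash_rate(a_j, c, p), nash_rate(a_i, c, p)
    q_ji = proof_payment(a_j, c, r_ij, p)
    q_ij = proof_payment(a_i, c, r_ji, p)
    # Utilities from Eq. 2.1, written out for both merchants.
    u_i = (-theta_ij * r_ij ** 2 * p - theta_ij * r_ij * a_j * c
           + theta_ij * r_ij * q_ji + theta_ji * r_ji * (a_i - q_ij))
    u_j = (-theta_ji * r_ji ** 2 * p - theta_ji * r_ji * a_i * c
           + theta_ji * r_ji * q_ij + theta_ij * r_ij * (a_j - q_ji))
    return {"r_ij": r_ij, "r_ji": r_ji, "q_ij": q_ij, "q_ji": q_ji,
            "u_i": u_i, "u_j": u_j}


def directed_welfare(theta, r, a_target, c, p):
    """Contribution of one direction to the social welfare in Eq. 2.2."""
    return theta * r * (a_target * (1 - c) - r * p)


if __name__ == "__main__":
    # Arbitrary illustrative parameters.
    solution = bargain(theta_ij=2.0, theta_ji=3.0, a_i=1.2, a_j=0.8, c=0.25, p=1.0)
    print(solution)
    # Compare the theorem's rate against a grid of alternative rates in [0, 1].
    grid_best = max(directed_welfare(2.0, k / 1000, 0.8, 0.25, 1.0) for k in range(1001))
    print(directed_welfare(2.0, solution["r_ij"], 0.8, 0.25, 1.0), grid_best)
```

With these payments the two utilities should coincide, and no rate on the grid should yield a higher welfare contribution than the rate from the theorem.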
Remarks: Intuitively the above result on exchange rates means the following: if the value of
acquiring a customer is negative, then the edge is not formed; if the value is positive then the
exchange rate on the edge is directly proportional to the acquisition value and negatively dependent
on the competitiveness, being bounded by the maximum limit of 1.
Also, it is not true in general that the Nash bargaining solution maximizes social welfare, but it
does in our case. To see why, note that the social welfare, which is the sum of the utilities of the two
merchants, is independent of the payments $q_{ij}$. Further, for any fixed value of social welfare, these
payments can be designed in such a way that the two utilities are equal to each other, and equal
to half the social welfare, thereby maximizing the product of the utilities for a given social welfare.
Thus, in order to maximize the product of the utilities, it is necessary that the sum of the utilities
is maximized.
We presented one possible set of payments in the proof. When the exchange rates are evaluated
to be less than 1, this set of payments reduces to the following: substituting $r_{ij} = \frac{a_j (1 - c_{ij})}{2p}$ into
$q_{ji} = (a_j (1 + c_{ij}) + r_{ij} p)/2$, which sets the term inside the settlement equation to 0, gives
$q_{ji} = \frac{a_j (3 + c_{ij})}{4}$ and $q_{ij} = \frac{a_i (3 + c_{ji})}{4}$.
However, there are other solutions as well. For instance, if $\theta_{ij} r_{ij} a_j = \theta_{ji} r_{ji} a_i$, then $q_{ij} = q_{ji} = 0$
is a solution as well. In this case, the reciprocal benefits exactly match for the two merchants and
hence no payments are needed. In fact one can show that there is always a solution where one of
$q_{ij}$ or $q_{ji}$ is 0, i.e., payments are unidirectional.
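The closed form of the proof's payments for exchange rates below 1 follows by direct substitution. The short derivation below is a worked check that uses only the payment expression from the proof and the rate from Theorem 2.1.1; the expression for $q_{ij}$ follows by exchanging the roles of $i$ and $j$.

```latex
\begin{aligned}
q_{ji} &= \frac{a_j (1 + c_{ij}) + r_{ij} p}{2},
\qquad \text{with } r_{ij} p = \frac{a_j (1 - c_{ij})}{2} \\
&= \frac{1}{2} \left( a_j (1 + c_{ij}) + \frac{a_j (1 - c_{ij})}{2} \right)
 = \frac{a_j \left( 2 + 2 c_{ij} + 1 - c_{ij} \right)}{4}
 = \frac{a_j (3 + c_{ij})}{4}.
\end{aligned}
```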
2.2 Empirical Investigation
Figure 2.2: Exchange Rates within the Star Alliance Program:
First, we look into the case of Star Alliance - a coalition of popular airlines across the world
including Aeroplan, Asiana Airlines, Thai Air, Egypt Air, Indian Airlines, and many others. Star
Alliance acts as a central moderator to set the exchange rates between different airlines to allow
conversion of miles. Figure 2.2 shows some of these exchange rate relationships - nodes indicate the
airline partners and an edge in clockwise direction from a partner A to B indicates that B’s miles
can be converted into A miles at the exchange rate specified on the edge. The airline partners shown
in the graph are mostly non-competitive as they belong to different countries. We collected this data
by crawling websites of different airlines. We make a few immediate observations: first, the graph
is almost a complete graph (with the exception of Air India and Avianca Airlines) with different
exchange rates between airline pairs. And second, for any airline, the incoming exchange rates, i.e.,
the exchange rates to convert that airline's points into other airlines' points, are close to each other.
Figure 2.3: Partnerships Across Multiple Merchants: Nodes represent the merchants and an undirected edge between two nodes indicates that they mutually allow exchange of points. Data collected by scraping http://www.webflyer.com. The labels on the nodes are the codeshare merchant abbreviations – for instance, the sixth node from the top left side is US01, which stands for U.S. Airways; the eighth and ninth nodes from the top left are AM and AC, which respectively stand for Amtrak and Aeroplan Canada. Similarly, one of the nodes in the middle portion is MC, which stands for American Express. And the third node from the top right is HY01, which stands for Hyatt Hotels.
Second, we look into a coalition of many more merchant partners, that seems to have emerged
out of bilateral negotiations between merchants. Figure 2.3 shows the exchange rate relationships
between many different merchants who run their standalone reward programs. These exchange rates
have been obtained by scraping http://www.webflyer.com. This website has a list of publicly
visible exchange rate relationships between merchants. The nodes in Figure 2.3 indicate merchants,
and an edge between two nodes indicates that they allow exchange of reward points between each
other. The graph shows three clear partitions: the one on the left consists of airlines, the one in the
middle of credit card companies, and the one on the right of hotel chains. There seem to be no edges within
partitions, and across partitions the graph appears to be almost complete. Nodes within a partition
seem to be competitors, and have no partnerships, whereas nodes across partitions seem to be
complementary. We make the following hypothesis: direct competitors do not form a partnership,
whereas complementary merchants do. But some merchants appear to be outliers in the figure. For
instance, the node labelled as AC, which stands for Aeroplan Canada, is connected to some nodes
on the left partition in addition to being connected to all nodes in the other two partitions. Aeroplan
Canada, being a Canadian airline, does not directly compete with many airlines in the left partition,
as most of them are U.S. based. Thus, the partnerships seem to have some strong relationship with
the complementarity between the merchant pairs. Note that we obtained a similar result using our
model in the previous section via Nash Bargaining between pairs of merchants – the exchange rate
was directly proportional to the complementarity (one minus competitiveness). Hence, extending
our model to multiple merchants and using Nash Bargaining between all pairs of merchants without
considering the outside network would create identical structures to the ones observed empirically.
We extend our model along the lines of the above hypotheses and show that the bargaining
methodology we propose is not too far from the optimal in terms of the overall social welfare.
2.3 Model Extensions
We extend the model to more than two merchants negotiating a joint reward program. This involves
deciding the directed exchange rates between every pair of merchants, and the payments made by
the merchants to each other to compensate for the additional liabilities, just like we saw in the case
of two merchants. But in this case a new feature of the problem arises. Suppose there are three
merchants A, B, and C. If the direct exchange rate between two of them, say A and C, is lower than
the “indirect” exchange rate obtained by first converting C-points into B-points and then converting
those into A-points, i.e., $r_{AC} < r_{AB} \times r_{BC}$, then the direct exchange rate $r_{AC}$ is rendered defunct,
since no customer will use it to convert points.
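To make the notion of a defunct direct rate concrete, the sketch below computes the best achievable exchange rate between every ordered pair of merchants and lists the direct edges that are dominated by an indirect route. This is an illustration under stated assumptions: the function names and the example rates are hypothetical, and the max-product recursion in the style of Floyd–Warshall is a choice made here (it is valid because all rates lie in [0, 1], so cycles never help).

```python
def best_rates(merchants, direct):
    """Best exchange rate for every ordered pair, allowing indirect conversions.

    direct[(x, y)] is r_xy, the number of x-points obtained per y-point.
    Converting z-points to y-points and then to x-points yields r_xy * r_yz.
    """
    best = {(x, z): direct.get((x, z), 0.0)
            for x in merchants for z in merchants if x != z}
    for y in merchants:                      # candidate intermediary
        for x in merchants:
            for z in merchants:
                if len({x, y, z}) == 3:
                    via = best[(x, y)] * best[(y, z)]
                    if via > best[(x, z)]:
                        best[(x, z)] = via
    return best


def defunct_edges(merchants, direct):
    """Direct edges whose rate is strictly below the best indirect rate."""
    best = best_rates(merchants, direct)
    return sorted(edge for edge, rate in direct.items() if rate < best[edge])


if __name__ == "__main__":
    merchants = ["A", "B", "C"]
    # Hypothetical rates: the direct A-from-C rate is below r_AB * r_BC.
    direct = {("A", "B"): 0.5, ("B", "A"): 0.5,
              ("B", "C"): 0.5, ("C", "B"): 0.5,
              ("A", "C"): 0.1, ("C", "A"): 0.3}
    print(best_rates(merchants, direct)[("A", "C")])
    print(defunct_edges(merchants, direct))
```

In this example only the direct rate for converting C-points into A-points is dominated, since going through B gives a higher rate.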
Not giving due consideration to this fact during the multi-party negotiation can have a severe effect
on the social welfare. For instance, this could happen if all the merchants set their exchange rates
by resorting to pairwise bilateral negotiations, without considering the externality imposed by the
decisions of others. Consider the example below.
Example: Let A and C be two competing airlines that operate between the same cities. We
capture this by having cAC = 1. Let B be a hotel, which naturally does not compete with either
A or C. We capture this by having cAB = cBA = 0 and cCB = cBC = 0. Further suppose that
aA = aB = aC = a. Since A and C are direct competitors, and p > 0, there is no additional
cumulative welfare if some agents convert points between them. In this case, value is actually
lost because of services provisioned against the points converted. Hence ideally, the exchange rates
between them should be 0. This may not be possible because of the existence of the indirect exchange
rate through B (since having a joint program with B may be welfare improving). But if A and B,
and C and B undergo bilateral negotiations without considering this effect of their decisions, the
resulting indirect exchange rate between A and C (through B) would typically be higher than the
one that optimizes the welfare. We can show that in this case, the worst case ratio of the optimal
social welfare and the welfare under bilateral Nash Bargaining is 1.58, if the maximum rate of
points-transfer ($\theta$) is the same between every merchant pair (Appendix 2.5.1). In fact, this worst case ratio
holds true even for a large class of similar situations.
Let there be K classes of merchants, such that within each class the merchants are competitors,
while across classes the merchants are non-competing. Assume that aj = a for all merchants. For
any two merchants i and j within a class, suppose that cij = cji = 1. For any two merchants i and
j belonging to different classes, suppose that $c_{ij} = c_{ji} = 0$. And assume that the maximum rate of
points-transfer ($\theta$) is the same between every merchant pair. Then we have the following result:
Proposition 1. For the above mentioned model, the worst case ratio of the optimal social welfare
and the welfare obtained using bilateral Nash Bargaining is 1.58.
We conjecture that 1.58 is still an upper bound to the ratio of welfare values in the more general
model where $a_j = a_\kappa$, where $\kappa$ is the class of merchant $j$, and for any two merchants $i$ and $j$
within a class $\kappa$, we have $c_{ij} = c_{ji} = 1$. Moreover, in this more general model, we can show that
the social welfare maximizing and pairwise bilateral Nash bargaining solutions lead to structurally
identical graphs: the exchange rates are zero between every merchant pair within the same class,
and they are non-zero across merchants belonging to different classes. We validate these results with
our empirical observations (cf. Fig. 2.3).
Non-Competing Merchants: We now show that in the case where all merchants are non-
competing, i.e., cij = cji = 0 for all i and j, the above issue does not arise, and somewhat sur-
prisingly, pairwise bilateral Nash Bargaining leads to the social welfare maximizing outcome. The
main point underlying this result is that if the exchange rates are set by bilateral Nash Bargaining
when the merchants are non-competing, then for a customer, the direct exchange rate between any
two merchants is at least as high as any indirect exchange rate.
Claim 1. For any merchant pair, let the exchange rates between them be set according to Theorem 2.1.1.
Then for any three merchants $i$, $j$, and $k$, $r_{ik} \ge r_{ij} \times r_{jk}$.
We leave the proof of the above claim to the appendix (Appendix 2.5.3). Observe that
Theorem 2.1.1 implies that bilateral Nash Bargaining maximizes the sum of the utilities of all the
merchants, assuming that only the direct exchange rates will be used by the customers (in this case
the social welfare maximization problem decomposes across the different pairs of merchants). But
if the customers are assumed to use the maximum exchange rate path between any two merchants,
then the maximum sum of utilities of the merchants can only be lower. But the above claim implies
that the solution obtained through bilateral Nash Bargaining naturally has the property that the
direct exchange rates are at least as high as the indirect ones between every merchant pair. Summarizing,
we thus have the following result:
Theorem 2.3.1. Suppose that merchants are non-competitive. Then exchange rates chosen by
pairwise bilateral Nash Bargaining between different pairs of merchants maximize the social welfare.
The above result indicates that pairwise negotiations are equivalent to a centrally coordinated
optimal solution, and this solution is a complete graph, where the exchange rate depends only on
the destination merchant, when merchants are non-competing. We extend the above result to the
case when the merchants are less heterogeneous and semi-competitive.
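Claim 1 can be spot-checked on a randomly generated instance of non-competing merchants. The following sketch is illustrative only: the helper names and the randomly drawn acquisition values are assumptions made here. It builds the rates of Theorem 2.1.1 with zero competitiveness and searches for triples that violate the inequality of the claim.

```python
import itertools
import random


def nash_rate(a_target, c, p):
    """r_ij = min{a_j (1 - c_ij) / (2p), 1} from Theorem 2.1.1."""
    return min(a_target * (1 - c) / (2 * p), 1.0)


def non_competing_rates(a, p):
    """Rates for all ordered pairs when c_ij = 0 for every pair.

    a maps each merchant to its acquisition value a_j.
    """
    return {(i, j): nash_rate(a[j], 0.0, p) for i in a for j in a if i != j}


def triangle_violations(rate, merchants, tol=1e-12):
    """Triples (i, j, k) with r_ik < r_ij * r_jk beyond a small tolerance."""
    return [(i, j, k)
            for i, j, k in itertools.permutations(merchants, 3)
            if rate[(i, k)] < rate[(i, j)] * rate[(j, k)] - tol]


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical acquisition values, some above 2p so that rates are capped at 1.
    a = {name: random.uniform(0.0, 4.0) for name in "ABCDEF"}
    rate = non_competing_rates(a, p=1.0)
    print(triangle_violations(rate, list(a)))
```

The rate on an edge depends only on the merchant whose acquisition value enters the formula, which is why no indirect route can beat the direct one in this setting.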
Semi-Competing Merchants: We now show that even if there is competition between mer-
chants, but the variance in competitiveness is not too high, then under mild conditions the social
welfare is still maximized by using Nash Bargaining between pairs of merchants. Specifically we
define the following two properties:
1. Semi-competitiveness: We say merchants within a network are semi-competitive if for any
three merchants i, j, and k the following holds:
$$1 - c_{ik} \ge (1 - c_{ij}) \cdot (1 - c_{jk})$$
Note that the above is equivalent to saying that the variance in competitiveness is not too
high.
2. Low acquisition value: The acquisition value $a_j$ for all merchants $j$ is bounded above by twice
the servicing cost, i.e., $a_j \le 2p$. This condition adds some level of homogeneity to all the merchants.
Claim 2. Let all merchants be semi-competitive and have low acquisition values, and the exchange
rates between them be set according to Theorem 2.1.1. Then for any three merchants i, j, and k,
$r_{ik} \ge r_{ij} \times r_{jk}$.
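Proof. Between any two merchants $i$ and $j$, the exchange rate set is $r_{ij} = \min\{\frac{a_j (1 - c_{ij})}{2p}, 1\}$. But
given the conditions, $\frac{a_j (1 - c_{ij})}{2p} \le 1$. Thus $r_{ij} = \frac{a_j (1 - c_{ij})}{2p}$. Then it is easy to see that for any three
merchants i, j, and k, the following holds:
$$\begin{aligned}
& 1 - c_{ik} \ge (1 - c_{ij}) \cdot (1 - c_{jk}) \\
\implies\ & \frac{a_k (1 - c_{ik})}{2p} \ge (1 - c_{ij}) \cdot \frac{a_k (1 - c_{jk})}{2p} \\
\implies\ & r_{ik} \ge (1 - c_{ij}) \cdot r_{jk} \\
\implies\ & r_{ik} \ge \frac{a_j (1 - c_{ij})}{2p} \cdot r_{jk} \\
\implies\ & r_{ik} \ge r_{ij} \cdot r_{jk}
\end{aligned}$$
Here the fourth line uses the low acquisition value condition $a_j \le 2p$.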
Following the above claim, Theorem 2.3.1 easily extends to this case as well and we get the
following:
Corollary 2.3.1. Suppose that merchants are semi-competitive and have low acquisition values.
Then exchange rates chosen by pairwise bilateral Nash Bargaining between different pairs of
merchants maximize the social welfare.
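Claim 2 can likewise be spot-checked. The sketch below generates one semi-competitive instance and verifies the inequality on all triples. The construction of the instance is an assumption made purely for illustration: merchants are placed on a line and the complementarity is set to $1 - c_{ij} = e^{-d_{ij}}$, where $d_{ij}$ is the distance between them, so that the triangle inequality for $d$ implies semi-competitiveness. All function names are hypothetical.

```python
import itertools
import math
import random


def semi_competitive_instance(n, p, seed=0):
    """Random instance with 1 - c_ij = exp(-|x_i - x_j|) and a_j <= 2p."""
    rng = random.Random(seed)
    position = [rng.uniform(0.0, 2.0) for _ in range(n)]
    a = [rng.uniform(0.0, 2.0 * p) for _ in range(n)]
    c = {(i, j): 1.0 - math.exp(-abs(position[i] - position[j]))
         for i in range(n) for j in range(n) if i != j}
    return a, c


def nash_rates(a, c, p):
    """Rates from Theorem 2.1.1 for every ordered pair."""
    return {(i, j): min(a[j] * (1.0 - c[(i, j)]) / (2.0 * p), 1.0) for (i, j) in c}


def is_semi_competitive(c, n, tol=1e-12):
    """Check 1 - c_ik >= (1 - c_ij)(1 - c_jk) for all distinct triples."""
    return all(1.0 - c[(i, k)] >= (1.0 - c[(i, j)]) * (1.0 - c[(j, k)]) - tol
               for i, j, k in itertools.permutations(range(n), 3))


def satisfies_claim(rate, n, tol=1e-12):
    """Check r_ik >= r_ij * r_jk for all distinct triples."""
    return all(rate[(i, k)] >= rate[(i, j)] * rate[(j, k)] - tol
               for i, j, k in itertools.permutations(range(n), 3))


if __name__ == "__main__":
    a, c = semi_competitive_instance(n=6, p=1.0, seed=3)
    print(is_semi_competitive(c, 6), satisfies_claim(nash_rates(a, c, 1.0), 6))
```

A single random instance is of course not a proof; it only exercises the chain of inequalities in the argument above.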
2.4 Conclusions
We modeled the inter-partner utilities that arise in coalition loyalty programs and used Nash Bargaining
between merchant pairs as a tool for negotiating the exchange rate and payment parameters. We
completely characterize the two merchant setting and show that the bargained solution has exchange
rates proportional to the complementarity between the merchants and that the solution maximizes
social welfare. Our model and results validate the empirical hypotheses: complete K-partite graphs
emerge where merchants within a partition are competitors and merchants across partitions are
non-competing. Finally, we extend the model to multiple merchants and show that pairwise Nash
Bargaining still achieves optimal or near optimal social welfare: if merchants are non-competing or
semi-competitive, the social welfare obtained is optimal; and if there is competition between some
merchants, then for a wide class of network structures of practical interest, the ratio between the
optimal welfare and the social welfare obtained is bounded above by a small quantity 1.58.
In short, pairwise Nash Bargaining is a powerful tool for conducting negotiations for coalition
formation in loyalty programs.
2.5 Appendix
2.5.1 Analysis of Social Welfare Gap Example
Since the maximum rate of points-transfer is the same between every merchant pair, we can ignore
it for calculating the ratio of social welfares. We also assume $a \le 2p$. Let us analyze the optimal
social welfare and the social welfare obtained via bilateral Nash Bargaining. First, observe the social
welfare (ignoring $\theta$):
$$-r_{AB}^2 p + r_{AB} a - r_{BC}^2 p + r_{BC} a - (r_{AB} r_{BC})^2 p \tag{2.4}$$
Nash Bargaining Solution
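Under bilateral Nash Bargaining, we get the exchange rates as $r_{AB} = r_{BC} = \frac{a}{2p} \le 1$ and no direct
edge between A and C (Theorem 2.1.1). For simplicity let $\frac{a}{2p} = t$. The social welfare in this case is:
$$\begin{aligned}
-\frac{a^2}{4p} + \frac{a^2}{2p} - \frac{a^2}{4p} + \frac{a^2}{2p} - \frac{a^4}{16p^3}
&= \frac{a^2}{2p} - \frac{a^4}{16p^3} \\
&= 2pt^2 (1 - t^2/2)
\end{aligned} \tag{2.5}$$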
Optimal Welfare
By symmetry, we can argue that $r_{AB} = r_{BC}$ at the welfare-maximizing solution.
Let these two exchange rates be $r$. Then the social welfare is:
$$2(-r^2 p + r a) - r^4 p = p(-2r^2 + 4rt - r^4) \tag{2.6}$$
The above value is maximized at the value of $r$ where the derivative of the function w.r.t. $r$ is 0.
The derivative is $p(-4r^3 - 4r + 4t)$. On substituting the root of this derivative as the value of $r$ in
the optimal welfare, and maximizing the ratio of optimal welfare and the Nash Bargaining solution
over $t$ from 0 to 1, we find that the value is maximized at $t = 1$. The value of $r$ obtained at $t = 1$ is
around 0.68 and the maximized ratio value is around 1.58.
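The numbers quoted above can be reproduced with a few lines of code. The sketch below is a numerical illustration rather than part of the original analysis: the function names are hypothetical, $p$ is normalized to 1 (it cancels in the ratio), and the root of the derivative is found by bisection, which is a choice made here.

```python
def nash_welfare(t, p=1.0):
    """Welfare under bilateral Nash Bargaining, Eq. 2.5: 2 p t^2 (1 - t^2 / 2)."""
    return 2.0 * p * t ** 2 * (1.0 - t ** 2 / 2.0)


def optimal_rate(t):
    """Root of r^3 + r = t on [0, 1], i.e., where the derivative of Eq. 2.6 vanishes."""
    lo, hi = 0.0, 1.0
    for _ in range(100):                 # bisection; r^3 + r is increasing in r
        mid = (lo + hi) / 2.0
        if mid ** 3 + mid < t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def optimal_welfare(t, p=1.0):
    """Optimal welfare, Eq. 2.6, evaluated at the maximizing rate."""
    r = optimal_rate(t)
    return p * (-2.0 * r ** 2 + 4.0 * r * t - r ** 4)


def welfare_ratio(t):
    """Ratio of optimal welfare to Nash Bargaining welfare for a given t = a / (2p)."""
    return optimal_welfare(t) / nash_welfare(t)


if __name__ == "__main__":
    for t in (0.25, 0.5, 0.75, 1.0):
        print(t, round(optimal_rate(t), 4), round(welfare_ratio(t), 4))
```

On a grid of values of $t$ in (0, 1], the ratio should increase with $t$ and peak at $t = 1$, consistent with the values of roughly 0.68 and 1.58 stated above.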
2.5.2 Proof of Proposition 1
First, we prove the proposition for 2 classes of merchants A and B; extending the result to
K classes is then fairly straightforward. Let there be m merchants in A and n in B. The proof proceeds
very similarly to the above simplified example.
Nash Bargaining Solution
Clearly the pairwise bilateral Nash Bargaining solution creates edges between merchant pairs across
A and B with exchange rate $r = \frac{a}{2p}$ (Theorem 2.1.1). This is assuming $a$ is no more than $2p$. Also
pairwise Nash Bargaining does not create an edge between merchant pairs within any partition. Let
$\frac{a}{2p} = t \le 1$. Consider three merchants A, B, and C, such that A and C are in the same partition
and B is in the other partition, same as the above example. Overall social welfare is just the welfare
obtained by these three merchants times the number of such merchant triplet combinations across
the two partitions. The welfare obtained between these three merchants can be calculated exactly
like above. Thus the value is:
$$\begin{aligned}
& (m(m-1)n + n(n-1)m) \times 2pt^2 (1 - t^2/2) \\
&= mn(m+n-2) \times 2pt^2 (1 - t^2/2)
\end{aligned} \tag{2.7}$$
Optimal Welfare
Again because of symmetry, we can argue that exchange rates between any merchant pairs across the
two partitions are the same. Let this quantity be $r$. And within any partition, having an exchange
rate greater than $r^2$ will only hurt welfare. And an exchange rate less than $r^2$ will never
be used. Thus, there are no edges between merchant pairs within any partition. Again like the
preceding argument, the overall social welfare can be written as:
$$\begin{aligned}
& (m(m-1)n + n(n-1)m) \times p(-2r^2 + 4rt - r^4) \\
&= mn(m+n-2) \times p(-2r^2 + 4rt - r^4)
\end{aligned} \tag{2.8}$$
It is easy to see that the ratio of welfares is exactly the same as that in the preceding example,
and hence the maximum value is the same, 1.58.
2.5.3 Proof of Theorem 2.3.1
As we argued, the proof of the theorem easily follows after proving Claim 1. We just prove the claim
here.
Proof. We first show that for any three nodes i, j, and k, the following holds:
$$r_{ik} \ge r_{ij} \times r_{jk} \tag{2.9}$$
Observe that $r_{ik} = r_{jk} = \min\{\frac{a_k}{2p}, 1\}$ (Theorem 2.1.1). And $r_{ij} \le 1$ by definition. Hence, the above
equation always holds.
Hence, the exchange rate along any directed edge is the maximum among all paths between those
two merchants. Thus all transactions happen via direct edges, and no transactions happen along
paths of length more than 1. Now the proof of the theorem easily follows.
Chapter 3
Credit Settlement in Coalition
Loyalty Programs
In this chapter we introduce an alternate model for settling transferred liabilities between merchants
participating within a coalition loyalty program.
Overview of our Model: We propose an alternate model for credit settlement in coalition loyalty
programs based on an abstraction of credit networks (Karlan et al. [2009], Ghosh et al. [2007],
DeFigueiredo and Barr [2005]). Credit networks are a versatile abstraction for modeling trust between
entities in a network. In our model, merchants participating in a coalition loyalty program, unilater-
ally commit to accepting each others’ reward points up to a limit (which we call the credit capacity)
at a specified exchange rate. Customers induce exchange of reward points between merchants as
discussed in the previous chapter. We refer to these exchanges of reward points as transactions in
the network of merchants. These transactions happen along paths with positive credit capacities
and maximum exchange rates to minimize conversion loss: i.e. a conversion of some merchant u’s
points into some other merchant v’s points can be facilitated only if there is a path with sufficient
credit capacity from v to u in the network, and the transaction occurs along the path that minimizes
the conversion loss. Suppose a merchant u commits to accepting up to c points issued by a merchant
v at an exchange rate of r. Say some customer transaction induces v to convert x v-points (where
x < c) for u-points; it gets xr u-points in return. The credit extended by u to v depletes by x, and v
promises to redeem x v-points to u in return for xr u-points at a future time. This is represented as
an IOU which can be used to allow for future customer transactions (we use the convention of
representing credit capacity and exchange rate along an edge in the currency of the target merchant).
Merchants that do not directly extend credit to each other can still exchange points via trusted
intermediaries. Thus, the credit network acts as a decentralized ledger that obviates the need for a
central entity to keep accounts.
Hence, transactions change credit capacities, and also introduce credit on reverse-edges to keep
track of IOUs. The exchange rates on reverse-edges are reciprocals of those on forward-edges, and
thus, they could be significantly greater than 1. The state of the system is the set of credit capacities
along all edges including reverse-edges (initially the credit capacity along each reverse edge is 0).
We say that the system is in a no-arbitrage state if the product of exchange rates along any directed
cycle with positive credit capacity is not more than 1.
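The no-arbitrage condition can be tested mechanically. The sketch below is one possible check and not a procedure from the text: it takes the logarithm of each exchange rate, so that a cycle whose rates multiply to more than 1 becomes a negative-weight cycle, and then runs a Bellman–Ford style relaxation. The function name, the edge representation `(u, v, capacity, rate)`, and the example values are assumptions.

```python
import math


def has_arbitrage(edges, tol=1e-12):
    """Return True if some directed cycle of positive-capacity edges has rate product > 1.

    edges: iterable of (u, v, capacity, rate) tuples.
    """
    # Only edges with positive capacity (and positive rate) can carry a transaction.
    active = [(u, v, -math.log(rate))
              for u, v, capacity, rate in edges if capacity > 0 and rate > 0]
    if not active:
        return False
    nodes = {node for u, v, _ in active for node in (u, v)}
    # Starting every node at distance 0 mimics a virtual source connected to all nodes.
    dist = dict.fromkeys(nodes, 0.0)
    for _ in range(len(nodes)):
        changed = False
        for u, v, weight in active:
            if dist[u] + weight < dist[v] - tol:
                dist[v] = dist[u] + weight
                changed = True
        if not changed:
            return False                  # distances settled: no negative cycle
    return True                           # still improving after |V| passes


if __name__ == "__main__":
    forward_and_iou = [("i", "j", 10.0, 0.5), ("j", "i", 4.0, 2.0)]   # product exactly 1
    mispriced = [("i", "j", 10.0, 0.5), ("j", "i", 4.0, 3.0)]         # product 1.5
    print(has_arbitrage(forward_and_iou), has_arbitrage(mispriced))
```

A forward edge paired with its reciprocal-rate reverse edge gives a cycle product of exactly 1, which is allowed; the tolerance guards against floating-point noise in that boundary case.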
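Figure 3.1: Illustrative Example of Dynamic IOU Settlement. Each edge between the three nodes i, j, and k is labeled with its credit capacity and exchange rate. (a) Before the transaction: edges labeled $c_{ij}; r_{ij}$, $c_{ji}; r_{ji}$, $c_{jk}; r_{jk}$, and $c_{kj}; r_{kj}$. (b) After the transaction: edges labeled $(c_{ij} - x \cdot r_{jk}); r_{ij}$, $c_{ji}; r_{ji}$, $(c_{jk} - x); r_{jk}$, and $c_{kj}; r_{kj}$, together with dotted IOU edges labeled $(x \cdot r_{ij} r_{jk}); \frac{1}{r_{ij}}$ and $(x \cdot r_{jk}); \frac{1}{r_{jk}}$.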
Figure 3.1 illustrates our model and how transactions change the state of the system. There are
three merchants: $i$, $j$, and $k$. Merchant $j$ extends a credit of $c_{ji}$ $i$-points to $i$ at an exchange rate of
$r_{ji}$, and $c_{jk}$ $k$-points to merchant $k$ at an exchange rate of $r_{jk}$; similarly merchant $i$ extends credit to
$j$, and $k$ extends credit to $j$. We assume that there is no arbitrage in the starting system, i.e., the product
of exchange rates along any directed cycle is not more than 1. Now suppose that some customer
transaction induces merchant $k$ to convert $x$ $k$-points into $i$-points. The maximum exchange rate
path in the network from $i$ to $k$ is chosen for the transaction and $k$ receives $x \cdot (r_{ij} r_{jk})$ $i$-points. The
credit extended by $j$ to $k$ depletes by $x$. Similarly, the credit extended by $i$ to $j$ depletes by $x \cdot r_{jk}$.
IOUs are introduced as shown by dotted edges in Fig. 3.1(b). This means that merchant $k$ promises
to give $x$ $k$-points to $j$ in return for $x \cdot r_{jk}$ $j$-points. Similarly, $j$ promises to give $x \cdot r_{jk}$ $j$-points to $i$ in
return for $x \cdot (r_{ij} r_{jk})$ $i$-points. Say at a future time, another customer transaction induces merchant
$j$ to convert $x \cdot r_{jk}$ $j$-points to $k$-points. The maximum exchange rate path is via the dotted green
edge from $k$ to $j$, as the no-arbitrage assumption implies $\frac{1}{r_{jk}} \geq r_{kj}$, and thus merchant $j$ receives $x$
$k$-points in return.
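The arithmetic of this three-merchant example can be checked with a short Python sketch. The numeric values (rates 9/10 and 4/5, capacities 100 and 50, x = 10) and the helper name `settle_example` are illustrative assumptions, not values from the text; exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

def settle_example(x, c_ij, c_jk, r_ij, r_jk):
    """Route x k-points along the path i-j-k of Figure 3.1 (hypothetical helper)."""
    if x > c_jk or x * r_jk > c_ij:
        raise ValueError("insufficient credit capacity on the path")
    return {
        "i_points_received_by_k": x * r_ij * r_jk,
        "c_ij_after": c_ij - x * r_jk,              # credit from i to j depletes by x * r_jk
        "c_jk_after": c_jk - x,                     # credit from j to k depletes by x
        "iou_k_to_j": (x * r_jk, 1 / r_jk),         # (capacity, rate) of the dotted edge k-j
        "iou_j_to_i": (x * r_ij * r_jk, 1 / r_ij),  # (capacity, rate) of the dotted edge j-i
    }

# Assumed values, chosen only for illustration.
result = settle_example(x=F(10), c_ij=F(100), c_jk=F(50), r_ij=F(9, 10), r_jk=F(4, 5))
print(result["i_points_received_by_k"])   # 36/5, i.e. 7.2 i-points

# Later, j redeems the IOU: x * r_jk j-points convert back at rate 1 / r_jk.
capacity, rate = result["iou_k_to_j"]
print(capacity * rate)                    # 10, i.e. the original x k-points
```

Redeeming the IOU at the reciprocal rate returns exactly the $x$ $k$-points that were originally exchanged, which is the sense in which future transactions settle earlier ones.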
Our Contributions: We formally establish two essential robustness properties of the proposed
framework:
1. No-arbitrage: Allowing dynamic settlement of IOUs by balancing existing IOUs with future
transactions never introduces any arbitrage in the system.
2. Non-concentration of market power: The state of the system is independent of the paths chosen
for the transactions as long as they are routed along the paths with maximum exchange rate.
This implies that no node is incentivized to demand money to act as an intermediary, and thus
no node acquires sufficient 'market power' even if it facilitates many transactions.
These two properties are essential to establish our model as a viable alternative to the credit
settlement process in coalition loyalty programs. The remainder of this chapter is structured as follows.
We first discuss some related work. Then, in Section 3.1, we formally introduce our model.
In Section 3.2 we present our theoretical results on no-arbitrage and non-concentration of market power.
Finally, in Section 3.3 we conclude with some future work and open problems.
Related Work: Credit networks were originally introduced in three parallel papers for modeling
trust between entities (DeFigueiredo and Barr [2005], Ghosh et al. [2007], Karlan et al. [2009]).
Credit networks have some immediate structural advantages: they are secure against malicious users,
they allow a decentralized formation leading to an organic growth of the system, and they allow
interactions between entities not directly related to each other. Dandekar et al. [2011] introduced a
decentralized payment infrastructure as an extension of credit networks and in a later paper studied
strategic formation of such networks (Dandekar et al. [2015]). Their model allowed nodes to print
their own currency and issue trust to each other to allow transactions along paths of sufficient trust.
They studied the problem of liquidity in such networks – showing that for various dense graph
topologies, the steady state probability of transactions failing under symmetric transaction regimes
was low. Our model is a direct extension to theirs, but with the addition of exchange rates across
every trust relationship issued between entities.
3.1 Model
We represent merchants by nodes in a directed multigraph $G = (V, E)$. For every edge $e = \langle i, j, r \rangle$
in $E$, let $c_e$ represent the credit capacity along the edge $e$: $i$ promises to accept up to $c_e$ points from
$j$ at the pre-specified exchange rate of $r$ along this edge. We represent $x$ points from a merchant $j$
by the quantity $x_j$. A transaction $t$ is denoted by the tuple $\langle i, j, x \rangle$, where $i$ sends $x_i$ to $j$. These
transactions can happen along paths in the network. We refer to a transaction $t = \langle i, j, x \rangle$ along an
edge $e = \langle j, i, r \rangle$ as an edge-transaction $\langle t, e \rangle$, which is feasible only if $c_e$ is at least as large as $x$. If
$\langle t, e \rangle$ occurs feasibly, $i$ gets in return $xr$ $j$-points, and the credit capacity on the edge $e$ decreases
by $x$. At a future time, $j$ can request a reverse favor, i.e., exchange $xr$ $j$-points to get back these
$x$ $i$-points. This reverse favor is represented by the reverse-edge $\bar{e} = \langle i, j, \frac{1}{r} \rangle$ with credit capacity
$xr$. We use the notation $\bar{e}$ to represent the reverse-edge of $e$. Note that the exchange rates on the
reverse-edges are not visible to customers, but are only there to settle IOUs among the merchants.
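A single edge-transaction and its reverse favor can be sketched in Python as follows. This is a minimal illustration under assumed conventions: the state is a dictionary from edge tuples `(acceptor, sender, rate)` to capacities, and the function name `route_edge` and the numbers are hypothetical.

```python
from fractions import Fraction as F

def route_edge(capacity, edge, x):
    """Route an edge-transaction of x points along edge = (acceptor, sender, r).

    The sender pays x of its own points and receives x * r of the acceptor's
    points. The capacity c_e drops by x and an IOU of x * r is recorded on the
    reverse-edge (sender, acceptor, 1 / r).
    """
    acceptor, sender, r = edge
    if x > capacity.get(edge, 0):
        raise ValueError("infeasible: c_e is smaller than x")
    reverse = (sender, acceptor, 1 / r)
    capacity[edge] -= x
    capacity[reverse] = capacity.get(reverse, 0) + x * r
    return x * r

# Assumed example: j accepts up to 20 i-points from i at rate 1/2.
e = ("j", "i", F(1, 2))
state = {e: F(20)}
print(route_edge(state, e, F(6)))          # 3 j-points go to i
print(state[e], state[("i", "j", F(2))])   # 14 3

# Reverse favor: j returns the 3 j-points and gets the 6 i-points back.
print(route_edge(state, ("i", "j", F(2)), F(3)))   # 6
print(state[e], state[("i", "j", F(2))])           # 20 0
```

After the reverse favor the capacities return to their initial values, so the IOU is fully settled.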
Transactions change the credit capacities along edges, and hence change the state of the network.
A state is simply the set of credit capacities along all edges in the network, and we denote
the credit capacity on an edge $e$ in state $S$ as $c_e(S)$. The network $G$ can be viewed as
the initial state, which we assume to have no arbitrage. Given a state $S$ of the credit network,
a transaction $t = \langle u_k, u_1, x \rangle$ from merchant $u_k$ to merchant $u_1$ is feasible along a path
$P = \{\langle u_1, u_2, r_1 \rangle, \langle u_2, u_3, r_2 \rangle, \ldots, \langle u_{k-1}, u_k, r_{k-1} \rangle\}$ from $u_1$ to $u_k$ if each edge in $P$ has adequate credit
capacity in terms of $u$-points, i.e., for all $e_j = \langle u_j, u_{j+1}, r_j \rangle$ in $P$,
$$c_{e_j}(S) \geq x_{u_k} = \Big( x \cdot \prod_{i=j+1}^{k-1} r_i \Big)_{u_{j+1}}.$$
Routing a feasible transaction $t$ along path $P$ in state $S$ results in a state $S'$ given by
$$c_e(S') = \begin{cases} c_e(S) - x \cdot \prod_{i=j+1}^{k-1} r_i, & \text{if } e = e_j \in P \\ c_e(S) + x \cdot \prod_{i=j}^{k-1} r_i, & \text{if } e = \bar{e}_j, \; e_j \in P \\ c_e(S), & \text{otherwise} \end{cases} \tag{3.1}$$
Note that, since merchants unilaterally decide to extend credit to each other, there may already
exist an edge $e'_j = \langle u_{j+1}, u_j, r'_j \rangle$ from $u_{j+1}$ to $u_j$ for all $1 \leq j \leq k-1$. Therefore, a credit network may
have up to two edges from some merchant $u$ to another merchant $v$. Routing a feasible transaction
only affects the total credit extended to the payer (merchant $u_k$) and the payee (merchant $u_1$); the
total credit extended to all other merchants remains unchanged. Also note that for each edge $e$ and
any two states $S$ and $S'$ of the network,
$$c_e(S) + c_{\bar{e}}(S)/r_e = c_e(S') + c_{\bar{e}}(S')/r_e \tag{3.2}$$
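The state update of Eq. 3.1 and the invariant of Eq. 3.2 can be checked numerically with a small Python sketch. The dictionary representation of a state, the function names `route_path` and `invariant`, and the numeric values are assumptions made for illustration only.

```python
from fractions import Fraction as F

def route_path(cap, path, x):
    """Apply Eq. 3.1. path = [(u1, u2, r1), ..., (u_{k-1}, u_k, r_{k-1})], x in u_k-points."""
    new = dict(cap)
    amount = x                                    # x * prod_{i=j+1}^{k-1} r_i, starting at j = k-1
    for edge in reversed(path):
        a, b, r = edge
        if new.get(edge, 0) < amount:
            raise ValueError("infeasible along this path")
        new[edge] -= amount                       # forward edge e_j loses capacity
        rev = (b, a, 1 / r)
        new[rev] = new.get(rev, 0) + amount * r   # reverse-edge gains x * prod_{i=j}^{k-1} r_i
        amount *= r
    return new

def invariant(cap, edge):
    """One side of Eq. 3.2 for a single edge: c_e + c_rev(e) / r_e."""
    a, b, r = edge
    return cap.get(edge, 0) + cap.get((b, a, 1 / r), 0) / r

# Assumed example: the path i-j-k with rates 9/10 and 4/5.
P = [("i", "j", F(9, 10)), ("j", "k", F(4, 5))]
S = {P[0]: F(100), P[1]: F(50)}
S_next = route_path(S, P, F(10))

print(S_next[P[0]], S_next[P[1]])                             # 92 40
print(all(invariant(S, e) == invariant(S_next, e) for e in P))  # True
```

The loop walks the path from the payer's end, so the running `amount` is exactly the product that appears in the first case of Eq. 3.1.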
For a transaction $t = \langle u_k, u_1, x \rangle$ and a path
$P = \{\langle u_1, u_2, r_1 \rangle, \langle u_2, u_3, r_2 \rangle, \ldots, \langle u_{k-1}, u_k, r_{k-1} \rangle\}$, we refer to the tuple $\langle t, P \rangle$ as a path-transaction.
Observe that routing a path-transaction $\langle t, P \rangle$ is equivalent to routing a sequence of $(k-1)$ edge-transactions
$\langle t_j, e_j \rangle$, $1 \leq j \leq k-1$, where $t_j := \langle u_{j+1}, u_j, x \cdot \prod_{i=j+1}^{k-1} r_i \rangle$ and $e_j = \langle u_j, u_{j+1}, r_j \rangle$.
For a path $P$ from a merchant $u$ to a merchant $v$ in the state $S$, we define
$c_P(S) := \sup\{x : x > 0, \; t = \langle v, u, x \rangle \text{ is feasible along } P\}$ as the capacity of path $P$ in state $S$. $c_P(S)$ is the maximum
payment in $v$-points that can be routed along $P$ in $S$. We use $\mathcal{P}_{uv}(S)$ to denote the set of paths
from $u$ to $v$ in state $S$ and $\bar{P}$ to denote the reverse path of path $P$. We define the exchange rate
along $P$ as $r_P := \prod_{e \in P} r_e$, where $r_e$ is the exchange rate along edge $e$ in $P$.
We route transactions along maximum exchange rate paths. More specifically, a transaction
$t = \langle v, u, x \rangle$ in state $S$ is routed according to the following recursive procedure: Consider a maximum
exchange rate path $P^*$ in $\mathcal{P}_{uv}(S)$. If $x \leq c_{P^*}(S)$, we route the path-transaction $\langle t, P^* \rangle$ and we
are done. Otherwise, we route $t^* := \langle v, u, c_{P^*}(S) \rangle$ along the path $P^*$, resulting in state $S'$. Let
$x' := x - c_{P^*}(S)$. We try to route the residual transaction $t' := \langle v, u, x' \rangle$ in state $S'$ recursively. If
any of the successive residual transactions fails, we roll back all transactions to restore state $S$.
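The routing procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: paths are found by brute-force enumeration of simple paths (suitable for tiny graphs only), rollback is obtained by working on a copy of the state, and all function names and numbers are hypothetical.

```python
from fractions import Fraction as F

def simple_paths(cap, src, dst, seen=()):
    """Yield simple paths (lists of edges) from src to dst over positive-capacity edges."""
    seen = seen or (src,)
    for (a, b, r), c in list(cap.items()):
        if a == src and c > 0 and b not in seen:
            if b == dst:
                yield [(a, b, r)]
            else:
                for rest in simple_paths(cap, b, dst, seen + (b,)):
                    yield [(a, b, r)] + rest

def rate(path):
    """Exchange rate r_P: the product of the rates along the path."""
    out = F(1)
    for _, _, r in path:
        out *= r
    return out

def path_capacity(cap, path):
    """c_P(S): the largest x (in points of the last node) that fits on every edge."""
    limit, scale = None, F(1)
    for edge in reversed(path):
        bound = cap[edge] / scale          # edge e_j must hold x * scale
        limit = bound if limit is None else min(limit, bound)
        scale *= edge[2]
    return limit

def apply_path(cap, path, x):
    """Update capacities in place as in Eq. 3.1."""
    amount = x
    for a, b, r in reversed(path):
        cap[(a, b, r)] -= amount
        cap[(b, a, 1 / r)] = cap.get((b, a, 1 / r), 0) + amount * r
        amount *= r

def route(cap, v, u, x):
    """Route t = <v, u, x>; return (new state, u-points obtained). cap itself is never modified."""
    state, obtained = dict(cap), F(0)
    while x > 0:
        paths = list(simple_paths(state, u, v))
        if not paths:
            raise ValueError("residual transaction failed; original state is kept")
        best = max(paths, key=rate)
        amount = min(x, path_capacity(state, best))
        apply_path(state, best, amount)
        obtained += amount * rate(best)
        x -= amount
    return state, obtained

# Assumed example: a direct edge i-k (rate 3/4, small capacity) and a path i-j-k (rate 18/25).
S = {("i", "k", F(3, 4)): F(4), ("i", "j", F(9, 10)): F(100), ("j", "k", F(4, 5)): F(50)}
S_next, got = route(S, "k", "i", F(10))
print(got)   # 183/25: 4 k-points at rate 3/4, then the residual 6 at rate 18/25
```

Because `route` only mutates its private copy, an infeasible transaction leaves the caller's state untouched, which plays the role of the rollback step.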
3.2 Results
Theorem 3.2.1. (No Arbitrage): Starting from a no-arbitrage state S, all states reachable through
feasible transactions are arbitrage-free.
Proof. We prove this by contradiction. Assume that, at some point in time, a transaction creates an
arbitrage, i.e., a cycle $C$ along which the product of exchange rates is greater than 1. Let this transaction
route along a path $P$. Since this transaction creates the arbitrage cycle $C$, it must interact with
the cycle $C$, i.e., $C \cap P \neq \emptyset$. Consider the last edge $e = \langle u, v, r \rangle \in C$ that gets changed due to this
transaction. Then $C$ would look as shown in Figure 3.2(a). That is, there are two paths from
node $u$ to $v$: one is the edge $e$ and the other has overall exchange rate $R$. The existence of this
path is guaranteed by the formation of the directed cycle $C$ after the transaction is routed. Since
the transaction chooses the edge $e$ instead of the other path from $u$ to $v$, we get $r \geq R$. Thus after
the transaction is routed, $\bar{e}$ is created, if it did not exist previously, and we get the directed cycle
$C$ as shown in Figure 3.2(b). The overall exchange rate along this cycle $C$ is $R \cdot \frac{1}{r} = \frac{R}{r}$. Now since
this is an arbitrage cycle, we get $\frac{R}{r} > 1 \implies R > r$, which is a contradiction. $\square$
Figure 3.2: Transaction from u to v Leading to Cycle C
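The role of maximum exchange rate routing in this argument can be illustrated with a brute-force check in Python. The sketch below is not part of the proof: it enumerates simple cycles on a tiny assumed graph, and the function names and numbers are hypothetical. Routing along the best path keeps every cycle product at most 1, while routing along a strictly worse path creates an arbitrage cycle.

```python
from fractions import Fraction as F

def apply_path(cap, path, x):
    """Return the state after routing x along path, as in Eq. 3.1."""
    new, amount = dict(cap), x
    for a, b, r in reversed(path):
        new[(a, b, r)] -= amount
        new[(b, a, 1 / r)] = new.get((b, a, 1 / r), 0) + amount * r
        amount *= r
    return new

def has_arbitrage(cap):
    """True if some directed cycle of positive-capacity edges has rate product > 1."""
    edges = [e for e, c in cap.items() if c > 0]

    def search(start, node, product, visited):
        for a, b, r in edges:
            if a != node:
                continue
            if b == start:
                if product * r > 1:
                    return True
            elif b not in visited and search(start, b, product * r, visited | {b}):
                return True
        return False

    return any(search(s, s, F(1), {s}) for s in {a for a, _, _ in edges})

# Assumed example: two routes from i to k.
S = {("i", "k", F(3, 4)): F(4), ("i", "j", F(9, 10)): F(100), ("j", "k", F(4, 5)): F(50)}
best = [("i", "k", F(3, 4))]                          # rate 3/4
worse = [("i", "j", F(9, 10)), ("j", "k", F(4, 5))]   # rate 18/25 < 3/4

print(has_arbitrage(S))                            # False
print(has_arbitrage(apply_path(S, best, F(2))))    # False
print(has_arbitrage(apply_path(S, worse, F(2))))   # True
```

In the last case the cycle through the direct edge and the two new reverse-edges has product $\frac{3}{4} \cdot \frac{5}{4} \cdot \frac{10}{9} = \frac{25}{24} > 1$, matching the $R > r$ situation that the proof rules out.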
Corollary 3.2.1. Any state reachable from a no-arbitrage state S through feasible transactions can
have at most three edges between any two merchants u and v; two from u to v, and 1 from v to u,
or vice versa.
Proof. We prove this by contradiction. Assume there exist 4 (the maximum possible) exchange rates
between $u$ and $v$. Then the possible exchange rates would be $r_1, \frac{1}{r_2}$ from $u$ to $v$, and $r_2, \frac{1}{r_1}$ from $v$
to $u$ ($r_1 \neq r_2$, $r_1 \neq \frac{1}{r_2}$). Clearly either $r_1 \cdot r_2$ or $\frac{1}{r_1 \cdot r_2}$ is greater than 1, contradicting Theorem 3.2.1.
$\square$
Definition 1. We call a cycle $C$ 'loss free' if the product of exchange rates along $C$ is 1. We say
a state $S'$ is reachable via loss free cycles from a state $S$ if there exist loss free cycles with appropriate
transaction amounts that lead to the transition, and we denote this as $S \overset{lC}{\Longrightarrow} S'$.
We define state equivalence under this notion of reachability through loss free cycles and represent
equivalence of two states $S$ and $S'$ as $S \overset{E}{=} S'$. Thus reachability via loss free cycles partitions the
set of states into equivalence classes. It is easy to see that if any state in an equivalence class
is arbitrage-free then all states in that equivalence class are arbitrage-free, as all the states in
the class are reachable from each other via loss free cycles. We call such an equivalence class an
arbitrage-free equivalence class. We use this as a notion of equivalence and show Theorem 3.2.3 on
path-independence. We set up some more notation first.
Definition 2. For any node $u$ in state $S$ we denote $N^{OUT}_u(S)$ as the out-neighbors of $u$, $N^{IN}_u(S)$
as the in-neighbors of $u$, $E^{OUT}_u(S)$ as the out-edges of $u$, and $E^{IN}_u(S)$ as the in-edges of $u$.
Definition 3. We establish the following notation for a node $u$ in state $S$:
$d^{OUT}_u(S) = \sum_{e \in E^{OUT}_u(S)} c_e \cdot r_e$: total amount of credit $u$ vests on its neighbors in $u$'s currency.
$d^{IN}_u(S) = \sum_{e \in E^{IN}_u(S)} c_e$: total amount of credit neighbors of $u$ vest on it in $u$'s currency.
$d^{OUT}(S)$ and $d^{IN}(S)$ denote the vectors of $d^{OUT}_u(S)$ and $d^{IN}_u(S)$ for all nodes $u$ in the graph and
are equivalent definitions for the Generalized Score Vector as in Dandekar et al. [2011]. We refer to both
$d^{OUT}(S)$ and $d^{IN}(S)$ as $d(S)$ or d-vectors for a state $S$. The following observations (Obs. 1 to 4)
follow easily.
Observation 1. A path-transaction $\langle t, P \rangle$ does not change the d-vectors of any nodes in the graph
except the source and sink of $P$.
Observation 2. A transaction $\langle t, C \rangle$ along a loss free cycle $C$ does not change the d-vectors of any
nodes in the graph.
Observation 3. If a state $S$ transitions to $S'$ via a transaction $t = \langle v, u, x \rangle$ along an edge $e =
\langle u, v, r \rangle$, then $\bar{e} = \langle v, u, 1/r \rangle$, $c_{\bar{e}}(S') \geq xr$, and $1/r = \max\{r(P) : P \in \mathcal{P}_{vu}(S')\}$.
Observation 4. If a state $S'$ is reachable from a state $S$ via a transaction $t$ along a loss free cycle
$C$, then $S$ is reachable from $S'$ via the transaction $t$ along the reverse loss free cycle $\bar{C}$.
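Observation 1 can be verified numerically on a small example. The Python sketch below assumes a dictionary state keyed by edge tuples `(u, v, r)` and uses hypothetical helper names and numbers; it computes the d-vectors of Definition 3 before and after one path-transaction.

```python
from fractions import Fraction as F

def d_out(cap, u):
    """d^OUT_u: sum of c_e * r_e over the out-edges e = (u, ., r_e)."""
    return sum(c * r for (a, _, r), c in cap.items() if a == u)

def d_in(cap, u):
    """d^IN_u: sum of c_e over the in-edges e = (., u, r_e)."""
    return sum(c for (_, b, _), c in cap.items() if b == u)

def apply_path(cap, path, x):
    """Return the state after routing x along path, as in Eq. 3.1."""
    new, amount = dict(cap), x
    for a, b, r in reversed(path):
        new[(a, b, r)] -= amount
        new[(b, a, 1 / r)] = new.get((b, a, 1 / r), 0) + amount * r
        amount *= r
    return new

# Assumed example: route x = 10 along the path i-j-k (rates 9/10 and 4/5).
P = [("i", "j", F(9, 10)), ("j", "k", F(4, 5))]
S = {P[0]: F(100), P[1]: F(50)}
S1 = apply_path(S, P, F(10))

# The intermediate node j keeps both entries of its d-vector.
print(d_out(S, "j"), d_out(S1, "j"))   # 40 40
print(d_in(S, "j"), d_in(S1, "j"))     # 100 100

# The endpoints change by x (at k) and by x * r_P (at i).
print(d_in(S, "k") - d_in(S1, "k"))    # 10
print(d_out(S, "i") - d_out(S1, "i"))  # 36/5
```

The changes at the endpoints, $x$ at the sink and $x \cdot r_P$ at the source, are the quantities that reappear in the proof of Claim 3.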
Next we show the equivalence theorem to characterize a simpler condition for state equivalence.
This is a technical theorem that helps with the proof of Theorem 3.2.3 and does not offer any direct
insights. We leave the proof to the appendix.
Theorem 3.2.2. For any two arbitrage-free states $S$ and $S'$ of the network the following statements
are equivalent:
1. $S$ and $S'$ belong to the same equivalence class.
2. $S'$ is reachable from $S$ via a combination of loss free cycles.
3. The d-vectors are the same at each node in $S$ and $S'$.
Theorem 3.2.3. (Path Independence): Starting from any state in an arbitrage free equivalence
class, irrespective of which paths are chosen for a set of successive transactions, as long as the paths
are exchange rate maximizing, the final equivalence class reached is the same.
Proof. We first show Lemma 1, that transactions induce transitions across equivalence classes:
irrespective of the choice of state in a fixed equivalence class $E_1$, any feasible transaction always
transitions the state into the same equivalence class $E_2$. Now we use a simple inductive argument
over Lemma 1 to prove the theorem. Let some set of $k$ transactions in a state $S$ belonging to an
equivalence class $E$ be feasible along two sets of maximum exchange rate paths $\{P_1, \ldots, P_k\}$ and
$\{P'_1, \ldots, P'_k\}$. Let the intermediate equivalence classes reached in the two cases be $\{E_1, \ldots, E_k\}$ and
$\{E'_1, \ldots, E'_k\}$ respectively. Applying Lemma 1 once shows $E_1 = E'_1$. Applying it recursively we get
$E_k = E'_k$.
Lemma 1. Fix two arbitrage-free states $S$ and $S'$ of the same network which belong to the same
equivalence class. Let $t = \langle v, u, x \rangle$ be a transaction. If $t$ transitions $S$ to $\hat{S}$ and $S'$ to $\hat{S}'$, then $\hat{S} \overset{E}{=} \hat{S}'$.
Proof. Let $P_1 = \arg\max\{r_P : P \in \mathcal{P}_{uv}(S)\}$ and $P_2 = \arg\max\{r_P : P \in \mathcal{P}_{uv}(S')\}$. We first show
that $r_{P_1} = r_{P_2}$. Since $S \overset{E}{=} S'$ we get from Theorem 3.2.2 that $S \overset{lC}{\Longrightarrow} S'$. We show the following
proposition.
Proposition 2. If $S'$ is reachable from $S$ via a single transaction along a loss free cycle $C$ then
$r_{P_1} = r_{P_2}$.
Proof. We prove this by contradiction. Assume $r_{P_1} \neq r_{P_2}$. Assume w.l.o.g. $r_{P_1} > r_{P_2}$. First
observe that $C \cap P_1 \neq \emptyset$, as otherwise $c_{P_1}(S') > 0$, which contradicts the maximality of the exchange rate
along $P_2$ in $S'$. Let $u_2$ be the first node of $P_1$ along $C$ and $v_2$ be the last node of $P_1$ along $C$. Let
$P$ be the sub-path of $P_1$ from $u_2$ to $v_2$. Let $P'$ be the sub-path of $C$ from $u_2$ to $v_2$. Then $r_P \geq r_{P'}$,
as $P_1$ is the maximum exchange rate path in $S$. Let $P''$ be the sub-path of $C$ from $v_2$ to $u_2$. Then
$$1 = r_C = r_{P'} \cdot r_{P''} \leq r_P \cdot r_{P''} \tag{3.3}$$
But $P + P''$ is also a cycle, and the no-arbitrage theorem (Theorem 3.2.1) immediately gives
$r_P \cdot r_{P''} \leq 1$. This implies that in Eq. 3.3 equality holds throughout. Thus $1 = r_P \cdot r_{P''} = r_{P'} \cdot r_{P''}$, which
implies $r_P = r_{P'} = \frac{1}{r_{P''}}$. In $S'$, the reverse loss free cycle $\bar{C}$ exists as $S'$ is reachable from $S$ via $C$
(Obs. 4). Thus $\bar{P}''$, i.e., the reverse path of $P''$, is a path in $S'$ from $u_2$ to $v_2$ with exchange rate
$r = \frac{1}{r_{P''}} = r_P$. Now it is easy to observe that $P_3 = P_1 \setminus P + \bar{P}'' \in \mathcal{P}_{u_1 v_1}(S')$ and
$r_{P_3} = \frac{r_{P_1}}{r_P} \cdot r = r_{P_1}$.
Since $r_{P_1} > r_{P_2}$ we get $r_{P_3} > r_{P_2}$, which contradicts the maximality of the exchange rate along $P_2$
in $S'$.
$\square$
Now let $S'$ be reachable from $S$ via single transactions along $k$ loss free cycles $\{C_1, \ldots, C_k\}$ with
intermediary states $\{S_1, \ldots, S_{k-1}\}$. Let $P'_i = \arg\max\{r_P : P \in \mathcal{P}_{uv}(S_i)\}$. Then Prop. 2 implies
$r_{P_1} = r_{P'_1} = r_{P'_2} = \ldots = r_{P'_{k-1}} = r_{P_2}$. Hence we have shown $r_{P_1} = r_{P_2}$. Now we use this fact to
show the following claim.
Claim 3. If $x \leq \min\{c_{P_1}(S), c_{P_2}(S')\}$, then $\hat{S} \overset{E}{=} \hat{S}'$, where $\hat{S}$ and $\hat{S}'$ denote the states
reached by routing $t$ in $S$ and $S'$ respectively.
Proof. In $S$, $P_1$ is a valid choice of path for the transaction $t$, and since $c_{P_1}(S) \geq x$, it is
also sufficient. Let $S_1$ be the state reached by transacting $t$ along $P_1$ in $S$. If $t$ goes through
any other choice of paths, all of them would have the same exchange rate as $P_1$ due to maximality of
the exchange rate along $P_1$ in $S$. Thus, in $\hat{S}$, loss free cycles will be formed for every path other than
$P_1$ used. Then it is easy to see that $\hat{S} \overset{lC}{\Longrightarrow} S_1$, which implies $\hat{S} \overset{E}{=} S_1$. Similarly, $\hat{S}'$ is equivalent
to a state $S_2$ reached by a $\langle t, P_2 \rangle$ transaction in $S'$. We will show $d(S_1) = d(S_2)$, which would
imply $\hat{S} \overset{E}{=} \hat{S}'$ by Theorem 3.2.2 and transitivity of equivalence. First, since the d-vectors are
not changed for intermediate nodes between the source and sink (Obs. 1), we get for all nodes $w$
except $u$ and $v$, $d_w(S_1) = d_w(S_2)$. Now $d^{IN}_v(S_1) = d^{IN}_v(S) - x = d^{IN}_v(S') - x = d^{IN}_v(S_2)$. And
$d^{OUT}_v(S_1) = d^{OUT}_v(S) + x = d^{OUT}_v(S') + x = d^{OUT}_v(S_2)$. Thus $d_v(S_1) = d_v(S_2)$. And since $r_{P_1} = r_{P_2}$,
we get $d^{IN}_u(S_1) = d^{IN}_u(S) + x \cdot r_{P_1} = d^{IN}_u(S') + x \cdot r_{P_2} = d^{IN}_u(S_2)$. And
$d^{OUT}_u(S_1) = d^{OUT}_u(S) - x \cdot r_{P_1} = d^{OUT}_u(S') - x \cdot r_{P_2} = d^{OUT}_u(S_2)$. Thus $d(S_1) = d(S_2)$. $\square$
If $x > \min\{c_{P_1}(S), c_{P_2}(S')\}$, let $x_1 = \min\{c_{P_1}(S), c_{P_2}(S')\}$. Let $t'_1 = \langle v, u, x_1 \rangle$. Let routing $t'_1$ in
$S$ and $S'$ bring the network to states $S_1$ and $S'_1$ respectively. Then the above claim shows $S_1 \overset{E}{=} S'_1$.
Let $t_1 = \langle v, u, x - x_1 \rangle$. Thus we have strictly reduced the transaction amount ($x$) by routing a
smaller transaction $t'_1$, and the states reached are equivalent to each other. Thus we end with the
same starting condition of state equivalence, and we can continue this procedure. Hence a simple
induction shows that $t$ will be completely routed in a finite number of steps according to the above
procedure and $\hat{S} \overset{E}{=} \hat{S}'$.
3.3 Conclusions
We extend the model introduced by Dandekar et al. [2011] to allow arbitrary exchange rates between
entities and use it to propose an alternative system for settling credits between members of a coalition loyalty
program. Our system is decentralized, does not need any centralized currency for settlements,
and is thereby easy to grow and distribute. The system allows transactions between merchants
who may not have a direct partnership. Additionally, it is secure against malicious merchants, as
a malicious merchant can create risks only for its direct partners. We showed two essential
properties that make our model a viable alternative: first, the system is secure against
the creation of arbitrage opportunities, and second, intermediaries to transactions do not have
incentives to demand payments for transactions through them, as long as alternate paths with the
same exchange rates exist. A critical question is whether the IOUs in credit networks can always
be settled via future transactions. Does the system ensure sufficient liquidity for transactions under some
model of customer transactions? Dandekar et al. [2011] showed liquidity under a simple transaction
regime for the setting with unit exchange rates between nodes. Extending it to arbitrary exchange
rates can provide useful insights into how customer transactions influence the need for renegotiation
of exchange rates within coalition loyalty programs, and into which exchange rates, depending on the
frequency of customer transactions, lead to long-term liquidity and stability of the network.
3.4 Appendix
3.4.1 Proof of Theorem 3.2.2
We show that for any two states $S$ and $S'$ the following holds:
$$S \overset{lC}{\Longrightarrow} S' \iff d(S) = d(S')$$
One side of this relation is straightforward: $S \overset{lC}{\Longrightarrow} S' \implies d(S) = d(S')$, as transactions along loss
free cycles do not change the d-vectors (Obs. 2). To prove the reverse side we first introduce some
more notation and show two lemmas, Lemma 2 and Lemma 3. Then the result follows easily. We
will use Eq. 3.2 a number of times in the proof, so we restate it here.
$$c_e(S) + c_{\bar{e}}(S)/r_e = c_e(S') + c_{\bar{e}}(S')/r_e \tag{3.4}$$
Definition 4. For two states $S$ and $S'$ of the network, we denote $G'(S, S')$ as the difference graph
of $S$ and $S'$ with edge capacities defined as follows:
$$c_e(G'(S, S')) = \begin{cases} c_e(S) - c_e(S'), & \text{if } c_e(S) - c_e(S') > 0 \\ 0, & \text{otherwise} \end{cases}$$
Let $S'$ be reached from $S$ via $n$ edge-transactions $\{t_1, \ldots, t_n\}$ transitioning through intermediate
states $\{S_1, \ldots, S_{n-1}\}$. Let $G'(S_i) = G'(S, S_i)$ for all $1 \leq i \leq n-1$ and $G'(S') = G'(S, S')$. The
following observations follow easily from the definition.
Observation 5. Let $S$ and $S'$ be two states of the same network. For any edge $e$, if $c_e(G'(S, S')) > 0$
then $c_{\bar{e}}(S') > 0$.
Observation 6. If $t_i$ ($i \leq n$) is the last transaction routed along an edge $e$, then $c_e(G'(S_i)) \geq c_e(G'(S'))$.
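Lemma 2. $d^{OUT}_u(S') = d^{OUT}_u(S) + d^{IN}_u(G'(S')) - d^{OUT}_u(G'(S'))$ for all $u \in G'(S')$.
Proof. Using Eq. 3.4 and the definition of $G'$ it is easy to verify the following:
$$c_e(S') = \begin{cases} c_e(S) - c_e(G'(S')), & \text{if } c_e(G'(S')) > 0 \\ c_e(S) + c_{\bar{e}}(G'(S'))/r_e, & \text{otherwise} \end{cases}$$
Thus
$$\begin{aligned}
d^{OUT}_u(S') &= \sum_{e \in E^{OUT}_u(S)} c_e(S') \times r_e \\
&= \sum_{e \in E^{OUT}_u(S);\, c_e(G'(S')) > 0} \big(c_e(S) - c_e(G'(S'))\big) \times r_e + \sum_{e \in E^{OUT}_u(S);\, c_e(G'(S')) = 0} \big(c_e(S) + c_{\bar{e}}(G'(S'))/r_e\big) \times r_e \\
&= \sum_{e \in E^{OUT}_u(S)} c_e(S) \times r_e + \sum_{e \in E^{OUT}_u(S);\, c_e(G'(S')) = 0} c_{\bar{e}}(G'(S')) - \sum_{e \in E^{OUT}_u(S);\, c_e(G'(S')) > 0} c_e(G'(S')) \times r_e \\
&= \sum_{e \in E^{OUT}_u(S)} c_e(S) \times r_e + \sum_{e \in E^{IN}_u(G'(S'))} c_e(G'(S')) - \sum_{e \in E^{OUT}_u(G'(S'))} c_e(G'(S')) \times r_e \\
&= d^{OUT}_u(S) + d^{IN}_u(G'(S')) - d^{OUT}_u(G'(S'))
\end{aligned}$$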
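The difference graph of Definition 4 and the identity of Lemma 2 can be checked on a concrete pair of states. The Python sketch below uses assumed numbers: $S'$ is the state obtained from $S$ by routing 10 points along a two-edge path with rates 9/10 and 4/5, written out by hand. The function names are hypothetical.

```python
from fractions import Fraction as F

def difference_graph(S, S_prime):
    """Definition 4: c_e(G') = c_e(S) - c_e(S') if that is positive, else 0."""
    edges = set(S) | set(S_prime)
    return {e: max(S.get(e, 0) - S_prime.get(e, 0), 0) for e in edges}

def d_out(cap, u):
    """d^OUT_u: sum of c_e * r_e over the out-edges e = (u, ., r_e)."""
    return sum(c * r for (a, _, r), c in cap.items() if a == u)

def d_in(cap, u):
    """d^IN_u: sum of c_e over the in-edges e = (., u, r_e)."""
    return sum(c for (_, b, _), c in cap.items() if b == u)

# Assumed states: S' results from routing 10 k-points along the path i-j-k in S.
S = {("i", "j", F(9, 10)): F(100), ("j", "k", F(4, 5)): F(50)}
S_prime = {
    ("i", "j", F(9, 10)): F(92), ("j", "k", F(4, 5)): F(40),
    ("j", "i", F(10, 9)): F(36, 5), ("k", "j", F(5, 4)): F(8),
}
G = difference_graph(S, S_prime)

print(G[("i", "j", F(9, 10))], G[("j", "k", F(4, 5))])   # 8 10

# Lemma 2: d_out(S') = d_out(S) + d_in(G') - d_out(G') at every node.
lemma2 = all(
    d_out(S_prime, u) == d_out(S, u) + d_in(G, u) - d_out(G, u)
    for u in ("i", "j", "k")
)
print(lemma2)   # True
```

Only edges whose capacity decreased carry weight in the difference graph; the reverse-edges created by the transaction have capacity 0 there.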
Lemma 3. For any pair of nodes $u, v$ there cannot exist 2 edges from $u$ to $v$ and 1 path from $v$ to
$u$ simultaneously in $G'(S')$ with non-zero credit capacity.
Proof. Assume the contrary, i.e., there exist distinct edges $e_1, e_2$ from $u$ to $v$ and a path $P$ from $v$ to $u$
with $c_{e_1}(G'(S')) > 0$, $c_{e_2}(G'(S')) > 0$, $c_P(G'(S')) > 0$. Assume w.l.o.g. $r_{e_1} > r_{e_2}$. Since $c_{e_1}(G'(S'))$,
$c_{e_2}(G'(S'))$, and $c_P(G'(S')) > 0$, some transactions must be routed along $e_1$, $e_2$, and along each
edge of $P$. Let $t_i$ ($i \leq n$) be the last edge-transaction along $e_1$, $e_2$, or some edge on $P$, and let $S_{i-1}$
be the state where this transaction takes place. We first show the following proposition:
Proposition 3. Let $P, P' \in \mathcal{P}_{uv}(S')$ be two edge-disjoint paths with $r_P > r_{P'}$. Then $c_{P'}(G'(S')) > 0$
implies $c_P(S') = 0$.
Proof. We prove this by contradiction. Assume $c_P(S') > 0$. Now $c_{P'}(G'(S')) > 0$ implies that some
of the transactions must have routed along edges in $P'$. Let $t_i$ be the last transaction routed along
$P'$ and let $t_i$ route via some edge $e_1 \in P'$. Then $c_{e'}(G'(S_{i-1})) \geq c_{e'}(G'(S')) > 0$ for all $e' \in P' \setminus \{e_1\}$
(Obs. 6). Obs. 5 implies $c_{\bar{e}'}(S_{i-1}) > 0$ for all $e' \in P' \setminus \{e_1\}$. Since the $t_i$ transaction routes via edge
$e_1 \in P'$, $e_1$ is the cheapest path for $t_i$. Now we state and prove a helpful claim.
Claim 4. Let $S$ be a state, and $P, P' \in \mathcal{P}_{u_1 v_1}(S)$ be two edge-disjoint paths with $r_P > r_{P'}$. Let
$e = \langle u_2, v_2, r \rangle \in P'$ be some edge and $t = \langle v_2, u_2, x \rangle$ be some transaction. If $e$ is the cheapest path
for $t$ and $c_{\bar{e}'}(S) > 0$ for all $e' \in P' \setminus \{e\}$, then $c_P(S) = 0$.
Proof. We prove this by contradiction. Assume $c_P(S) > 0$. Let $P_1$ be the sub-path of $\bar{P}'$ from $u_2$
to $u_1$, and $P_2$ be the sub-path of $\bar{P}'$ from $v_1$ to $v_2$. Since $c_{\bar{e}'}(S) > 0$ for all $e' \in P' \setminus \{e\}$,
$c_{P_1}(S) > 0$ and $c_{P_2}(S) > 0$. Thus there exists a path from $u_2$ to $v_2$ along $P_1$ followed by $P$ followed
by $P_2$. The exchange rate along this path is $r_{P_1} \times r_P \times r_{P_2} = \frac{r_P \times r_e}{r_{P'}} > r_e$. Thus this path is cheaper
than $e$ for $t$. Hence, we have a contradiction.
Continuing onto the proof of our proposition, the above claim implies $c_P(S_{i-1}) = 0$. Now since
$c_P(S') > 0$, there must be a transaction after $t_i$ that routes along some edge on $P$. Let $t_j$ be the last
such transaction, and let it route along an edge $e_2 \in P$. Thus we get $c_{e'}(G'(S_{j-1})) \geq c_{e'}(G'(S')) > 0$
for all $e' \in P \setminus \{e_2\}$ (Obs. 6). Obs. 5 implies $c_{\bar{e}'}(S_{j-1}) > 0$ for all $e' \in P \setminus \{e_2\}$. Since the $t_j$ transaction
routes via edge $e_2 \in P$, the above claim implies $c_{P'}(S_j) = 0$.
Since $c_{P'}(S') > 0$, there must be a transaction after $t_j$ that routes along some edge on $P'$. But
$t_i$ was assumed to be the last transaction along any edge on $P'$, and $t_j$ occurred after $t_i$, hence we have
a contradiction.
Continuing onto the proof of the lemma, we consider the following three cases:
1. $t_i$ is along $e_1$ implies $c_{e_1}(S_{i-1}) > 0$.
$c_{e_2}(G') > 0 \implies c_{e_2}(G'(S_{i-1})) > 0$ (Obs. 6).
Since $r_{e_1} > r_{e_2}$ and both $e_1$ and $e_2$ are from $u$ to $v$, we get a contradiction from Prop. 3.
2. $t_i$ is along $e_2$ implies $c_{e_2}(S_{i-1}) > 0$.
$c_P(G') > 0 \implies c_P(G'(S_{i-1})) > 0$ (Obs. 6).
$c_P(G'(S_{i-1})) > 0 \implies c_{\bar{P}}(S_{i-1}) > 0$ (Obs. 5).
Since $e_2$ is used for the $i$th transaction, $r_{e_2} > r_{\bar{P}}$. Again this contradicts Prop. 3.
3. $t_i$ is routed along an edge $e' = \langle u', v', r' \rangle \in P$ implies $c_{e'}(S_{i-1}) > 0$. For any edge $e = e_2$ or
$e \in P \setminus e'$ we get
$c_e(G') > 0 \implies c_e(G'(S_{i-1})) > 0$ (Obs. 6) $\implies c_{\bar{e}}(S_{i-1}) > 0$ (Obs. 5).
Let $\tilde{P}$ denote the reverse of $P \setminus \{e'\} + e_2$. $\tilde{P}$ is a path from $u'$ to $v'$ and $c_{\tilde{P}}(S_{i-1}) > 0$ (from the above observation).
Since $e'$ is used for the $i$th transaction we get $r(e') > r(\tilde{P})$. And since $c_{e'}(G'(S_{i-1})) > 0$ and
$c_{\tilde{P}}(S_{i-1}) > 0$, we again get a contradiction from Prop. 3.
Now we continue onto the proof of the main theorem. We show that $G'(S')$ can be decomposed
into a combination of loss free cycles, which immediately implies $S \overset{lC}{\Longrightarrow} S'$ given that the d-vectors
are the same in both $S$ and $S'$.
For the sake of notational convenience, we denote $G'(S')$ by $G'$. First observe that since $G'$ is the
difference graph, $d^{OUT}_u(G') = d^{IN}_u(G')$ for all $u \in G'$. Lemma 3 clearly applies. We prove this
by contradiction. Assume $G'$ cannot be decomposed into a combination of loss free cycles. Let $G''$
be the graph obtained after removing all possible loss free cycles from $G'$. Observe that removing loss free
cycles preserves Lemma 2. Also, removal of loss free cycles leads to equal change in $d^{OUT}$ and $d^{IN}$,
thus $d^{OUT}_u(G'') = d^{IN}_u(G'')$ for all $u \in G''$. Observe that a source or sink node cannot satisfy this
property, hence $G''$ cannot contain any source or sink node. Let $P$ be a maximal set of contiguous
edges, each having non-zero credit capacity along the same direction in $G''$. $P$ must be a cycle, since
$G''$ cannot have any source or sink nodes. Also, Lemma 3 shows that there cannot be two edges
between any two nodes $u, v$ on $P$, since, along the cycle, there already exists a path from $v$ to $u$. Now
any node along $P$ does not have any incoming or outgoing edges except those along $P$, as otherwise
it contradicts the maximality of $P$. Thus we get for each edge $e = \langle u, v, r_e \rangle \in P$, $d^{OUT}_u(G'') = c_e \times r_e$
and $d^{IN}_v(G'') = c_e$. Thus $d^{OUT}_u(G'')/r_e = d^{IN}_v(G'')$ for each $e = \langle u, v, r_e \rangle \in P$. Since $d^{OUT}_u(G'') =
d^{IN}_u(G'')$ for all $u \in G''$, we get $d^{OUT}_u(G'')/r_e = d^{OUT}_v(G'')$ for each $e = \langle u, v, r_e \rangle \in P$. Taking the
product over all edges in $P$ we get for any node $u \in P$, $d^{OUT}_u(G'') = d^{OUT}_u(G'') \cdot \prod_{e \in P} r_e \implies
\prod_{e \in P} r_e = 1$, which means $P$ is loss free, which is a contradiction. This completes our proof of the
equivalence theorem.
Chapter 4
Optimal Design of a Frequency Reward Program
In this chapter we investigate the problem of designing a frequency reward program for a merchant
competing against a traditional pricing merchant. There is extant literature on characterizing customer
behavior toward frequency reward programs. Most of the literature is empirical in nature, and relies on
psychological behavioral patterns among customers, as opposed to rational economic decision making.
We consider a competitive duopoly of two merchants where one merchant offers a loyalty reward
program and the other offers traditional pricing with discounts, and characterize a novel model of
customer choice where customers measure their utilities in rational economic terms. In addition,
we characterize the optimal reward design choice for the merchant offering the frequency reward
program, based on different customer populations: specifically, how the merchant should decide the
optimal reward redemption thresholds and dollar value of rewards to optimize for its revenue share
from the participating customer population.
The remainder of the chapter is structured as follows. First we describe some past work.
Then we go over our contributions and explain how our work builds on past literature.
In Section 4.1 we describe our model, followed by the main results in Section 4.2. We follow
up with a short discussion and future work in Section 4.3.
Related Work: Three popular psychological constructs have been used to explain customer choice
dynamics toward reward programs: Goal Gradient Hypothesis, Medium Maximization, and Tipping
Point Dynamics. Kivetz et al. [2006] conducted an empirical study observing an acceleration in the
number of purchases by customers as they approached the reward, i.e., as customers accumulated
reward points to reach closer to achieving the reward, their effort invested toward gaining more
points increased. The authors attributed this behavior to Goal Gradient Hypothesis (Hull [1932]).
CHAPTER 4. OPTIMAL DESIGN OF A FREQUENCY REWARD PROGRAM 34
This behavior is also very prevalent in online badge systems, such as those on Stackoverflow;
recently, mathematical models relying on rational user behavior have been developed that explain this
phenomenon (Anderson et al. [2013]). Stourm et al. [2015], Dreze and Nunes [2004] observed that
customers often stockpiled reward points even when there were economic incentives against the
collection of points. They attributed this behavior to Medium Maximization: customers often treated
collecting reward points as a goal in itself, just like collecting stamps, as opposed to connecting reward
points with economic incentives. Correspondingly, they introduced a new model where customers
had different "mental accounts" and utility functions for points and cash. Gao et al. [2014] observed
via experimentation that customers often collect reward points for exogenous reasons until they
accumulate a threshold amount, after which they start investing effort toward the collection process
itself. That is, customers build up switching costs (Klemperer [1995]) before fully adopting the
reward program, and sometimes this switching cost is created due to reasons exogenous to rational
economic incentives. They referred to this behavior as the Tipping Point Effect.
A large body of literature investigates the switching costs customers face within a competitive
duopoly framework; see Villas-Boas [2015] for a short survey. Our model is closest in spirit to
that of Hartmann and Viard [2008] and Kopalle and Neslin [2001]. Both papers are empirical in
nature and model a competitive duopoly where customers maximize their long term discounted
utility. Hartmann and Viard [2008] argue that less frequent buyers face higher switching costs as
they are more likely to be affected by reward redemption deadlines, whereas frequent buyers redeem
rewards easily and do not face substantial switching costs. They do not model how customers build
up switching costs, but only argue what happens when customers are close to achieving a reward.
Kopalle and Neslin [2001] discuss dynamic competition between two merchants deciding whether
to offer a reward program or traditional pricing and model this decision problem as a two stage
game: first merchants decide whether to offer a reward program or traditional pricing and then
they decide their prices. Using simulations, depending on customer parameters in the model, they
characterize the conditions for when it is better to offer a reward program versus traditional pricing.
We, on the other hand, model a multi-period problem where the customer behavior is characterized
using a complete dynamic program, and mathematically analyze our model. We make two modeling
assumptions: the first is an exogenous visit probability bias toward the reward program merchant, which
can be attributed to excess loyalty, since customers often build up higher brand preference toward
the merchant offering a reward program (Fader and Schmittlein [1993], Sharp and Sharp [1997]);
the second is a look-ahead factor for customers, which indicates how far into the future customers
can perceive the rewards (Liu [2007], Lewis [2004]). Our results on customer choice dynamics
intuitively look similar to some of those obtained in the above mentioned body of literature. But
more importantly, we model and optimize the revenue objective of the merchant, characterizing an
optimal reward program design for maximizing expected revenue.
Our Contributions:
Model Overview
We model a competitive duopoly of two merchants, one of them offering a frequency reward program
and the other offering traditional pricing. Both merchants sell an identical good at fixed
precommitted prices. The reward program merchant sells the good at a higher price. With each purchase
from the reward program merchant, a customer gains some fixed number of points, and on achieving
the reward redemption threshold, (s)he immediately gains the reward value as a dollar cashback.
Customers measure their utilities in rational economic terms, i.e., they make their purchase
decisions to maximize long term discounted rewards. The discount factor is the time value of
money, and we assume it to be constant for all customers. We also assume that every customer
makes a purchase every day from either of the two merchants. We relax these two assumptions by
introducing a look-ahead factor that controls how far into the future a customer can perceive the
rewards. This affects the customer behavior dynamics as follows: if the reward redemption threshold
is farther than the customer's look-ahead parameter, (s)he is unable to perceive the future value
of that reward and take it into consideration while maximizing long term utility. This parameter,
being customer specific, adds heterogeneity to both the future discounting and purchase frequency:
customers having high purchase frequency might be able to perceive rewards with higher redemption
thresholds. We only model myopic and strategic customers, i.e., the look-ahead parameter being 0
or a large value, and leave further parametrization for future work. But importantly, the framework
we develop could be applied and modified to more complex look-ahead distributions.
In addition, we assume each customer has a visit probability bias with which (s)he purchases
the good from the reward program merchant for reasons exogenous to utility maximization. This
behavior may be attributed to excess loyalty (Fader and Schmittlein [1993], Sharp and Sharp [1997]),
which has been argued to be an important parameter for the success of any reward program, or it
may be attributed to price insensitivity of customers: whenever a customer is price insensitive, (s)he
strictly prefers to purchase from the reward program merchant, as (s)he gains points redeemable
for rewards in the future. There are many possible reasons for customers’ price insensitivity: the
reward program merchant could be offering some other monopoly products, or the customer might
be getting reimbursed for some purchases as part of corporate perks (e.g., corporate travel). In
effect, this visit probability bias controls how frequently the customer’s points increase even when
(s)he does not actively choose to make purchases from the reward program merchant. Both the
look-ahead and excess loyalty parameters can be attributed to bounded rationality of customers and
have been argued to be important factors in customer choice dynamics, as discussed above in
the related work.
Results Overview
We formulate the customer choice dynamics as a dynamic program with the state being the number
of points collected from the reward program merchant. When the customer does not make a biased
visit to the reward program merchant, (s)he makes a purchase decision by comparing the immediate
utility of purchasing the good at a cheaper price with the long-term utility of waiting and receiving
the time-discounted reward. The solution to the customer’s dynamic program gives conditions for the
existence and achievability of a phase transition: a points threshold before which the customer visits
the merchant offering rewards only due to the visit probability bias, and after which (s)he adopts
the program and always visits the merchant offering rewards until receiving the reward. We show that
this phase transition occurs sooner for strategic customers. Increasing the reward value also makes
the phase transition occur earlier. However, increasing the points threshold required to redeem the
reward or the price discount offered by the traditional pricing merchant delays this tipping point. In
short, these results verify that our model is consistent with the different psychological constructs
discussed in the related work section: purchase acceleration closer to reward redemption, and a
tipping point before which purchases are only due to the loyalty bias.
After characterizing the customer behavior dynamics in our model, we optimize the long-run
revenue that the reward program merchant achieves. We model a specific case of proportional
promotion budgeting: the reward offered by the reward program merchant is proportional
to the product of the distance to the reward and the discount provided by the traditional pricing
merchant, with the proportionality constant being another parameter in the design of the reward
program. We show that under proportional promotion budgeting, the optimal distance to reward
and the proportionality constant follow an intuitive product relationship which is independent
of the customer population parameters, and these values correspond closely to cashback percentages
observed in the real world. In addition, optimizing the revenue objective gives the same
optimal distance to reward as minimizing the phase transition point as defined above. Moreover, for
a specific choice of loyalty bias distribution, we characterize the conditions, in terms of the customer
parameters, under which the revenue of the reward program merchant is higher than that of the
traditional pricing merchant, and under which it is better for the reward program merchant to offer
a reward than to offer none. We show that for the reward program to be effective under both
of the above conditions, a minimum fraction of the customer population must be strategic. Moreover,
for each such fraction of strategic customers, there is a specific range of values of the loyalty bias
between 0 and 1 for which the reward program is strictly better for the merchant.
4.1 Model
We index the two competing merchants selling identical goods as A and B. Without loss of generality,
we assume that A sells the good for a price of 1 dollar while B sells it for $1 - v$ dollars, i.e., B offers a
discount of $v$ dollars. A, on the other hand, offers a reward of value $R$ dollars to a customer after (s)he
makes $k$ purchases at A. We investigate only the case that we refer to as “proportional promotion
budgeting”, wherein the reward $R$ is proportional to the product of the distance to the reward $k$
and the discount $v$ provided by B. That is, $R = \alpha k v$, where $\alpha$ is assumed to be a constant.
4.1.1 Customer Behavior Model
We assume customers purchase the item from either A or B every day, i.e., we ignore the heterogeneity
in purchase frequency among the customers in our model and leave it for future work. We assume
customers have a linear homogeneous utility in price: at price $p$ the utility is $\nu(p) = 1 - p$. This
reduces to customers getting an immediate utility of 0 from A and $v$ from B. All customers have
the same time value of money, expressed as a discount factor $\beta$ lying between 0 and 1.
We denote a customer’s visit probability bias and look-ahead parameter by $\lambda$ and $t$,
respectively. That is, with probability $\lambda$, (s)he purchases from A due to externalities, and (s)he
perceives a future reward only if it is within $t$ purchases away. We assume $\lambda$ for a customer to be
drawn from a uniform distribution on $[0, b]$, where $b$ is between 0 and 1. We focus on a simple threshold
distribution for the look-ahead parameter $t$:
$$t = \begin{cases} t_1, & \text{with probability } p, \\ 0, & \text{with probability } 1 - p. \end{cases}$$
The above distribution intuitively means that the customers are either myopic and focus only
on immediate rewards, or are far-sighted (we assume $t_1$ is large). We model the customer’s decision
problem as a dynamic program. We index the number of purchases the customer has made from A
in the current reward cycle by $i$, for $0 \le i \le k - 1$, and we say a customer is in state $i$ after
having made $i$ purchases from A. At state $i$, the customer has two possibilities:
1. With probability $\lambda$, the customer must visit A, and (s)he is now in state $i + 1$.
2. With probability $1 - \lambda$, the customer may purchase from B for an immediate utility $v$ and
remain in state $i$, or purchase from A for no immediate utility but move to state $i + 1$.
Let $V(i)$ denote the long-term expected reward at state $i$. Then we model the decision problem
as the following dynamic program.
$$V(i) = \lambda \beta V(i+1) + (1 - \lambda) \max\{v + \beta V(i),\ \beta V(i+1)\} \quad \text{for } 0 \le i \le k - 1,$$
$$V(k) = R.$$
We show that the decision process exhibits a phase transition; that is, prior to some state, the
customer purchases from A only if (s)he must do so exogenously, but after that state, (s)he always
decides to purchase from A. This phase transition point is independent of $\lambda$ and, among the variable
customer parameters, depends only on $t$. Hence we denote this phase transition point by $i_0(t)$.
4.1.2 Merchant Objective
Given the above model of customer dynamics, we define the revenue objectives of A and B, where
A chooses its parameters and B is non-strategic. We define the rate of revenue for a merchant from
a customer as the expected time-averaged revenue that the merchant receives within the customer’s
lifetime. For simplicity, we assume merchants do not discount future revenues. As described
above, a customer’s dynamics are cyclic: they repeat after each reward cycle. Thus, the lifetime
dynamics of customer behavior form a regenerative process with independent and identically
distributed reward cycle lengths. Let $\mathrm{RoR}_A(c)$ and $\mathrm{RoR}_B(c)$ denote the expected rates of
revenue for A and B, respectively, over a customer $c$’s lifetime. Let $\tau(t, \lambda)$ denote the total number
of purchases the customer makes before reaching the phase transition point $i_0(t)$. Then the length of
the reward cycle (the total number of purchases the customer makes before receiving the reward) is
$\tau(t, \lambda) + k - i_0(t)$, since after the phase transition (s)he makes all purchases from A until (s)he
receives the reward. In this cycle, the number of visits that the customer makes to A is $k$, and the
number to B is $\tau(t, \lambda) - i_0(t)$. The revenue that A earns in one such cycle is $k - R$, and the revenue
that B earns is $(1 - v)(\tau(t, \lambda) - i_0(t))$. Thus the rates of revenue for A and B from the customer $c$
are as follows:
$$\mathrm{RoR}_A(c) = E_\tau\left[\frac{k - R}{\tau(t, \lambda) + k - i_0(t)}\right]$$
$$\mathrm{RoR}_B(c) = E_\tau\left[\frac{(\tau(t, \lambda) - i_0(t))(1 - v)}{\tau(t, \lambda) + k - i_0(t)}\right]$$
Since the process for a single customer is regenerative, using the renewal reward theorem (Cinlar
[1969]), we can take the expectation over the cycle length separately in the numerator and the
denominator. Note that $E_\tau[\tau(t, \lambda)] = i_0(t)/\lambda$, since before reaching the phase transition point,
with probability $\lambda$, the number of purchases by the customer from A increases by 1, and with
probability $1 - \lambda$ it stays constant. Then, taking the expectation over the customer population, the
overall rates of revenue for A and B are as follows:
$$\mathrm{RoR}_A = E_{\lambda, t}\left[\frac{k - R}{i_0(t)/\lambda + k - i_0(t)}\right] \tag{4.1}$$
$$\mathrm{RoR}_B = E_{\lambda, t}\left[\frac{(i_0(t)/\lambda - i_0(t))(1 - v)}{i_0(t)/\lambda + k - i_0(t)}\right] \tag{4.2}$$
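As a rough numerical illustration of the renewal reward argument, the following Python sketch compares the per-customer rates implied by Eqs. (4.1) and (4.2) for a single customer with a simulated time average over many reward cycles. The function names, the parameter values (k = 27, R = 1.35, v = 0.05, a phase transition point of 18 and a bias of 0.3) and the seeded pseudo-random generator are illustrative assumptions, not part of the model.

```python
import random


def rates_closed_form(k, R, v, i0, lam):
    """Per-customer rates of revenue for a fixed bias lam > 0 and transition point i0."""
    cycle = i0 / lam + k - i0              # expected purchases per reward cycle
    ror_a = (k - R) / cycle                # A sells k units at price 1 and pays out R
    ror_b = (i0 / lam - i0) * (1 - v) / cycle
    return ror_a, ror_b


def simulate_rates(k, R, v, i0, lam, n_cycles=2000, seed=0):
    """Time-averaged revenues over n_cycles simulated reward cycles (hypothetical helper)."""
    rng = random.Random(seed)
    purchases = 0
    revenue_a = 0.0
    revenue_b = 0.0
    for _ in range(n_cycles):
        points = 0
        # Before the phase transition the customer visits A only when the bias forces it.
        while points < i0:
            purchases += 1
            if rng.random() < lam:
                points += 1
            else:
                revenue_b += 1 - v
        # After the phase transition every remaining purchase in the cycle is made at A.
        purchases += k - i0
        revenue_a += k - R
    return revenue_a / purchases, revenue_b / purchases


if __name__ == "__main__":
    print(rates_closed_form(27, 1.35, 0.05, 18, 0.3))
    print(simulate_rates(27, 1.35, 0.05, 18, 0.3))
```

With enough cycles the simulated averages should settle near the closed-form values, which is the content of the renewal reward step; the simulation is only a sanity check and plays no role in the analysis.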
4.2 Results
4.2.1 Customer Choice Dynamics
We first show that every customer exhibits the following behavior: until (s)he reaches the phase
transition point $i_0(t)$, (s)he purchases from A only due to the exogenous visit probability bias, and
after that (s)he always purchases from A until (s)he receives the reward. This behavior is cyclic, and
repeats after every reward redemption.
Lemma 4. $V(i)$ is an increasing function of $i$ if the following condition holds:
$$R > \frac{(1 - \lambda) v}{1 - \beta} \tag{4.3}$$
Further, $V(i)$ can be evaluated as:
$$V(i) = \max\left\{\frac{\lambda \beta V(i+1) + (1 - \lambda) v}{1 - (1 - \lambda)\beta},\ \beta V(i+1)\right\} \tag{4.4}$$
Proof. First we show by induction that $V(i)$ is an increasing function of $i$. We first show
that if the condition above is satisfied, then $V(k-1) < V(k) = R$. Suppose not, so $V(k-1) \ge R$. Then
$v + \beta V(k-1) > \beta V(k)$, so from the dynamic program of Section 4.1.1 we have:
$$\begin{aligned}
V(k-1) &= \lambda \beta V(k) + (1 - \lambda)(v + \beta V(k-1)) \\
&= \frac{\lambda \beta R + (1 - \lambda) v}{1 - (1 - \lambda)\beta} \\
&< \frac{\lambda \beta R + (1 - \beta) R}{1 - (1 - \lambda)\beta} \\
&= \frac{R(1 - (1 - \lambda)\beta)}{1 - (1 - \lambda)\beta} = R
\end{aligned}$$
But this is a contradiction, so $V(k-1) < V(k)$. Now assume $V(i+1) < V(i+2)$ for some $i \le k - 2$;
we will show that this implies $V(i) < V(i+1)$. Suppose not, so $V(i) \ge V(i+1)$. As before,
we may upper bound $V(i)$.
$$\begin{aligned}
V(i) &= \lambda \beta V(i+1) + (1 - \lambda)(v + \beta V(i)) \\
&\le (1 - \lambda) v + \beta V(i) \\
\iff V(i) &\le \frac{(1 - \lambda) v}{1 - \beta}
\end{aligned}$$
But because $V(i+1) < V(i+2)$, we may lower bound $V(i+1)$.
$$\begin{aligned}
V(i+1) &\ge \lambda \beta V(i+2) + (1 - \lambda)(v + \beta V(i+1)) \\
&= (1 - \lambda) v + (1 - \lambda)\beta V(i+1) + \lambda \beta V(i+2) \\
&> (1 - \lambda) v + \beta V(i+1) \\
\iff V(i+1) &> \frac{(1 - \lambda) v}{1 - \beta}
\end{aligned}$$
Together these bounds give $V(i) < V(i+1)$, contradicting our assumption. So $V(i) < V(i+1)$, and
$V(i)$ is an increasing function of $i$. Now we prove the second claim. We have the following:
$$\begin{aligned}
V(i) &= \lambda \beta V(i+1) + (1 - \lambda) \max\{v + \beta V(i),\ \beta V(i+1)\} \\
&= \max\{\lambda \beta V(i+1) + (1 - \lambda)(v + \beta V(i)),\ \beta V(i+1)\}
\end{aligned}$$
Assuming $V(i)$ is the left term in the above maximum, we may solve the equation for that term.
$$\begin{aligned}
V(i) &= \lambda \beta V(i+1) + (1 - \lambda)(v + \beta V(i)) \\
(1 - (1 - \lambda)\beta) V(i) &= \lambda \beta V(i+1) + (1 - \lambda) v \\
V(i) &= \frac{\lambda \beta V(i+1) + (1 - \lambda) v}{1 - (1 - \lambda)\beta}
\end{aligned}$$
And we get our claim.
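The recursion in Eq. (4.4) can be evaluated backward from $V(k) = R$. The Python sketch below does this and then locates the first state at which the customer willingly prefers A, using the comparison between the two terms of the maximum. The function names and the numerical values (k = 27, R = 1.35, v = 0.05, a discount factor of 0.9 and a bias of 0.3) are assumptions chosen for illustration only.

```python
def value_function(k, R, v, beta, lam):
    """Backward recursion for V(0), ..., V(k) using the closed form of Eq. (4.4)."""
    V = [0.0] * (k + 1)
    V[k] = R
    for i in range(k - 1, -1, -1):
        # Value when the customer buys from B unless forced to visit A.
        prefer_b = (lam * beta * V[i + 1] + (1 - lam) * v) / (1 - (1 - lam) * beta)
        # Value when the customer always buys from A at this state.
        prefer_a = beta * V[i + 1]
        V[i] = max(prefer_b, prefer_a)
    return V


def first_adoption_state(k, R, v, beta, lam):
    """Smallest state at which choosing A beats choosing B (hypothetical helper)."""
    V = value_function(k, R, v, beta, lam)
    for i in range(k):
        prefer_b = (lam * beta * V[i + 1] + (1 - lam) * v) / (1 - (1 - lam) * beta)
        if beta * V[i + 1] > prefer_b:
            return i
    return k


if __name__ == "__main__":
    V = value_function(27, 1.35, 0.05, 0.9, 0.3)
    print(first_adoption_state(27, 1.35, 0.05, 0.9, 0.3))
    print([round(x, 4) for x in V])
```

For these values the condition of Lemma 4 holds, so the computed values increase with the state, and the customer adopts the program once the discounted reward outweighs the per-visit discount.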
Now if the expected reward of the customer increases with the number of purchases made from
A, we expect that at some number of purchases it becomes profitable for the customer to choose
to purchase from A as opposed to B. We characterize this phase transition point in the following
theorem.
Theorem 4.2.1. Suppose $V(i)$ is an increasing function of $i$, and consider a customer with look-ahead
parameter $t$. A phase transition occurs after (s)he makes $i_0(t)$ visits to firm A, where $i_0(t)$ is
given by:
$$i_0(t) = \begin{cases} k - \Delta \equiv i_0, & \text{if } t \ge \Delta, \\ k - t, & \text{otherwise,} \end{cases} \tag{4.5}$$
with
$$\Delta = \left\lfloor \log_\beta\left(\frac{v}{R(1 - \beta)}\right) \right\rfloor \tag{4.6}$$
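Proof. First we solve for the condition on $V(i+1)$ under which the customer willingly chooses firm A over B.
$$\begin{aligned}
& \beta V(i+1) > \frac{\lambda \beta V(i+1) + (1 - \lambda) v}{1 - (1 - \lambda)\beta} \\
\iff\ & \beta V(i+1) \left(1 - \frac{\lambda}{1 - (1 - \lambda)\beta}\right) > \left(\frac{1 - \lambda}{1 - (1 - \lambda)\beta}\right) v \\
\iff\ & \beta V(i+1) \left(\frac{1 - (1 - \lambda)\beta - \lambda}{1 - (1 - \lambda)\beta}\right) > \left(\frac{1 - \lambda}{1 - (1 - \lambda)\beta}\right) v \\
\iff\ & \beta V(i+1) \left(\frac{(1 - \lambda)(1 - \beta)}{1 - (1 - \lambda)\beta}\right) > \left(\frac{1 - \lambda}{1 - (1 - \lambda)\beta}\right) v \\
\iff\ & \beta V(i+1) > \frac{v}{1 - \beta} \\
\iff\ & V(i+1) > \frac{v}{\beta(1 - \beta)}
\end{aligned}$$
Let $i_0$ be the minimum state $i$ such that the above holds, so in particular $V(i_0) \le \frac{v}{\beta(1 - \beta)}$ but
$V(i_0 + 1) > \frac{v}{\beta(1 - \beta)}$. Because $V$ is increasing in $i$, this point is indeed a phase transition:
$V(i) > \frac{v}{\beta(1 - \beta)}$ for all $i > i_0$, so after this point the customer always chooses firm A. We may
compute $V(i_0)$ easily using this fact.
$$V(i_0) = \beta V(i_0 + 1) = \cdots = \beta^{k - i_0} V(k) = \beta^{k - i_0} R$$
Thus, we have the following:
$$\begin{aligned}
& \beta^{k - i_0} \le \frac{v}{R \beta (1 - \beta)} < \beta^{k - (i_0 + 1)} \\
\iff\ & k - i_0 \ge \log_\beta\left(\frac{v}{R \beta (1 - \beta)}\right) > k - (i_0 + 1) \\
\iff\ & i_0 \le k - \log_\beta\left(\frac{v}{R(1 - \beta)}\right) + 1 < i_0 + 1 \\
\iff\ & i_0 = k - \left\lfloor \log_\beta\left(\frac{v}{R(1 - \beta)}\right) \right\rfloor \equiv k - \Delta
\end{aligned}$$
If $t \ge \Delta$, the customer perceives the reward prior to this tipping point, so $i_0(t) = i_0 = k - \Delta$.
If $t < \Delta$, the customer does not perceive the reward at this point; as soon as (s)he does
perceive the reward, (s)he is already beyond this point and adopts the reward program, so $i_0(t) = k - t$. The
above dependence reduces to the following after incorporating our specific look-ahead distribution:
$$i_0(t) = \begin{cases} i_0, & \text{with probability } p, \\ k, & \text{with probability } 1 - p. \end{cases}$$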
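The phase transition formula in Eqs. (4.5) and (4.6) is straightforward to evaluate directly. The following Python sketch is a minimal illustration; the function names and the numerical values (k = 27, v = 0.05, a discount factor of 0.9, and rewards of 1.35 and 2.0) are assumptions made for the example.

```python
import math


def delta(beta, v, R):
    """Delta from Eq. (4.6): floor of log base beta of v / (R (1 - beta))."""
    return math.floor(math.log(v / (R * (1 - beta)), beta))


def phase_transition(k, t, beta, v, R):
    """Phase transition point i_0(t) from Eq. (4.5) for look-ahead parameter t."""
    d = delta(beta, v, R)
    return k - d if t >= d else k - t


if __name__ == "__main__":
    # A far-sighted customer (large t) adopts the program earlier than a myopic one (t = 0).
    print(phase_transition(27, 1000, 0.9, 0.05, 1.35))
    print(phase_transition(27, 0, 0.9, 0.05, 1.35))
    # A larger reward moves the transition earlier.
    print(phase_transition(27, 1000, 0.9, 0.05, 2.0))
```

A myopic customer never adopts the program before the final purchase in this sketch, which matches the second case of the specialized distribution.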
Note that the phase transition point is independent of $\lambda$, the customer’s visit probability bias
toward the merchant. As we would expect, it decreases with the look-ahead parameter and increases
with the price discount offered by merchant B. Additionally, it decreases with an increase in the reward
value ($R$) and with a decrease in the distance to reward ($k$). The variation with the discount factor $\beta$
is interesting: we can show that for any $R/v \ge 1$ there exists a $\beta \in [0, 1]$ that minimizes the phase
transition point $i_0$ for strategic customers. We refer to the ratio of the number of visits required for a
forward-looking customer to adopt a reward program to the total distance to the reward as the
“influence zone”. Intuitively, this is the fraction of visits over which the merchant wants to influence the
customer by offering exogenous means of earning additional points, such as bonus miles in airlines or
accelerated earnings, as discussed in the introduction. Next we find the optimal $k$ for minimizing
this influence zone when $\alpha$ is constant.
Remark 1. Under proportional promotion budgeting, the influence zone is minimized at
$k = \frac{e}{\alpha(1 - \beta)}$, as long as $\beta$ is close to 1.
Proof. By definition, the influence zone is $\frac{i_0}{k} = \frac{k - \Delta}{k} = 1 - \frac{\Delta}{k}$. Thus minimizing the
influence zone is equivalent to minimizing $\frac{k}{\Delta}$. Under proportional promotion budgeting $R = \alpha k v$, so
$$\frac{k}{\Delta} = \frac{k}{\log_\beta\left(\frac{1}{\alpha k (1 - \beta)}\right)} \sim \frac{k(1 - \beta)}{\log(\alpha k (1 - \beta))}$$
The above approximation relies on $\beta$ being close to 1, so that $-\log \beta \approx 1 - \beta$. This value is
minimized at $k = \frac{e}{\alpha(1 - \beta)}$. Therefore, for all distributions of excess loyalty, the optimal value
for $k$ is given by $\frac{e}{\alpha(1 - \beta)}$, the value for which $\frac{k}{\Delta}$ is minimized and takes the value $\frac{e}{\alpha}$.
At this value the influence zone takes the value $1 - \frac{\alpha}{e}$.
Note that if $\alpha$ is 1, then the value of $k$ corresponds to a cashback between 2% and 4% as $\beta$ ranges
between 0.95 and 0.9. This is in line with what is observed in practice.
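The minimizer in Remark 1 can be checked numerically by a brute-force search over integer reward distances. The Python sketch below evaluates the approximate objective from the proof; the function names, the search range and the example values of the budget constant and the discount factor are illustrative assumptions.

```python
import math


def k_over_delta_approx(k, alpha, beta):
    """Approximate k / Delta from the proof of Remark 1 (valid for beta close to 1)."""
    return k * (1 - beta) / math.log(alpha * k * (1 - beta))


def best_integer_k(alpha, beta, k_max=500):
    """Integer k in [1, k_max] minimizing the approximate k / Delta (hypothetical helper)."""
    # The objective is only meaningful when the logarithm is positive.
    candidates = [k for k in range(1, k_max + 1) if alpha * k * (1 - beta) > 1]
    return min(candidates, key=lambda k: k_over_delta_approx(k, alpha, beta))


if __name__ == "__main__":
    for beta in (0.9, 0.95):
        k_star = math.e / (1.0 * (1 - beta))   # continuous minimizer e / (alpha (1 - beta))
        print(beta, round(k_star, 2), best_integer_k(1.0, beta))
```

The integer minimizer found by the search should sit next to the continuous value $e/(\alpha(1-\beta))$, and the minimum of the objective should be close to $e/\alpha$.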
4.2.2 Merchant Objective Dynamics
Optimizing Reward Parameters
So far we have characterized the customer behavior within the duopoly without concern about the
particular reward design parameters. In this section, we derive optimal parameters for the reward
program design with the objective of maximizing the revenue of the reward program merchant.
Interestingly, we see that maximizing revenue corresponds to minimizing the influence zone, as
illustrated above.
Theorem 4.2.2. Under proportional promotion budgeting, the optimal reward distance ($k$) and the
optimal budget proportion ($\alpha$) for merchant A follow the relation $\alpha k = \frac{e}{1 - \beta}$ for all
distributions of $\lambda$, as long as $\beta$ is close to 1.
Proof. Let $\theta = \frac{\Delta}{k}$. First, we evaluate $\mathrm{RoR}_A$. We substitute the value of the phase transition
point obtained above into the rate of revenue equation for A. Since we assume that
$\lambda$ and $t$ are drawn independently of each other, we can separate the expectation terms and evaluate
them sequentially, first over $t$, then over $\lambda$.
$$\begin{aligned}
\mathrm{RoR}_A &= E_{\lambda, t}\left[\frac{k - R}{i_0(t)/\lambda + k - i_0(t)}\right] \\
&= E_\lambda\left[p \cdot \frac{k - R}{i_0/\lambda + k - i_0} + (1 - p)\frac{\lambda(k - R)}{k}\right] \\
&= E_\lambda\left[p \cdot \frac{\lambda(k - R)}{k\lambda + i_0(1 - \lambda)} + (1 - p)\frac{\lambda(k - R)}{k}\right] \\
&= E_\lambda\left[p \cdot \frac{\lambda(1 - \alpha v)}{1 - \theta(1 - \lambda)} + (1 - p)\lambda(1 - \alpha v)\right]
\end{aligned}$$
Observe that the term inside the expectation is maximized when $\theta$ is maximized, for all values
of $\lambda \in (0, 1)$. Using Leibniz’ Rule, we can conclude that the integral itself is maximized when $\theta$
is maximized, which, as shown above, is equivalent to minimizing the influence zone. As shown in
Remark 1, this happens at $\alpha k = \frac{e}{1 - \beta}$. And at this point, $\theta = \frac{\Delta}{k} = \frac{\alpha}{e}$.
An interesting point to observe above is that if $\alpha$ is constant, then maximizing the revenue
objective is equivalent to minimizing the influence zone. This result matches the intuition that
the faster the merchant can get customers to adopt the reward program, the more purchases they
will make with the merchant in the long run, but it is stronger, as it actually maximizes the revenue
objective as well. Although reward point accelerations are common and effective mechanisms to
get customers to adopt reward programs, we have shown that designing the reward program so that
a minimum number of such accelerations is required leads to maximizing the merchant’s revenue. The
condition that $\beta$ be close to 1 is not very restrictive, as the discount factor is expected to be high
in most cases. Note that because $k \ge \Delta$, the above also shows $\alpha \le e$. Finally, observe that we
need $R > \frac{(1 - \lambda) v}{1 - \beta}$ for $V$ to be increasing. We meet this condition with proportional budgeting when
$k = \frac{e}{\alpha(1 - \beta)}$, as $R = \alpha k v = \frac{ev}{1 - \beta} > \frac{v}{1 - \beta} \ge \frac{(1 - \lambda) v}{1 - \beta}$.
The above framework can be used for optimizing for the reward parameters to maximize A’s
rate of revenue, for varying distributions of the customer population. That is, if a merchant has a
good estimate of its customer population’s distribution, it can easily utilize the above theorem to
optimize its reward scheme. We leave the competitive study where merchant B could strategize on
its discount value v for future work.
Figure 4.1 shows how the rate of revenue of the reward program merchant varies with $\alpha$, after
fixing $\alpha k$ as in our previous theorem, for various distributions of the loyalty bias parameter. We
observe three general patterns for $\mathrm{RoR}_A$: for large values of $v$, $\mathrm{RoR}_A$ decreases along all feasible
values of $\alpha$; for small values of $v$, it increases for all values of $\alpha$; and for some values of $v$ in between,
it is convex with a minimizing $\alpha$ in $(0, e)$. That is, the rate of revenue for A is maximized at $\alpha$
being 0 or $e$, and no maximizer exists between 0 and $e$ across distributions. We believe this to
be true for all important distributions, as our simulations suggest. Thus, the reward
program merchant only needs to decide between not offering the reward (setting $\alpha$ to 0) and offering
the highest feasible reward (setting $\alpha$ to $e$) in our model. Note that the exact values of $v$ for which
these patterns occur also depend on $p$ and the parameters of the specific distribution for loyalty
bias. In the following subsection, we explore these conditions in detail for the uniform distribution
of loyalty bias for fixed $\alpha$.
Figure 4.1: Rate of revenue for the reward merchant as a function of $\alpha$ (with $k = \frac{e}{\alpha(1 - \beta)}$) for different
distributions: (a) uniform distribution, (b) normal distribution, (c) logit-normal distribution. For all
distributions, $\beta = 0.9$, $p = 0.9$ and $v$ varies as labeled. The uniform distribution is on $(0, b]$ with
$b = 0.9$; the normal distribution has $\mu = 0.5$ and $\sigma = 0.1$; and the logit-normal distribution is the
standard one on $[0, 1]$.
Revenue Comparisons
We characterize the conditions under which it is strictly better for A to offer a reward program for a
specific distribution of the loyalty bias parameter: $\lambda$ for every customer is drawn uniformly at
random from $(0, b]$, where $b$ is less than 1. We will assume this distribution for the remainder of
the section. Being strictly better boils down to two conditions: first, the rate of revenue for A should be
higher than that of B; and second, the rate of revenue for A should be higher than it could have
achieved by not offering the reward program at the same fixed price. First, we evaluate the expected
rates of revenue for both A and B under the optimality relation between $k$ and $\alpha$ mentioned above,
with $\lambda$ drawn from a uniform distribution.
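$$\begin{aligned}
\mathrm{RoR}_A &= E_\lambda\left[p \cdot \frac{\lambda(1 - \alpha v)}{1 - \theta(1 - \lambda)} + (1 - p)\lambda(1 - \alpha v)\right] \\
&= pk \cdot \frac{1 - \alpha v}{\Delta} \cdot \left(1 - \frac{k - \Delta}{b\Delta}\log\left(1 + \frac{b\Delta}{k - \Delta}\right)\right) + (1 - p)\frac{bk(1 - \alpha v)}{2k} \\
&= (1 - \alpha v)\left(\frac{pe}{\alpha}\left(1 - \frac{e - \alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e - \alpha}\right)\right) + (1 - p)\frac{b}{2}\right)
\end{aligned}$$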
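$$\begin{aligned}
\mathrm{RoR}_B &= E_{\lambda, t}\left[\frac{(i_0(t)/\lambda - i_0(t))(1 - v)}{i_0(t)/\lambda + k - i_0(t)}\right] \\
&= E_\lambda\left[p \cdot \frac{(i_0/\lambda - i_0)(1 - v)}{i_0/\lambda + k - i_0} + (1 - p)\frac{(k/\lambda - k)(1 - v)}{k/\lambda}\right] \\
&= E_\lambda\left[p \cdot \frac{i_0(1 - \lambda)(1 - v)}{k\lambda + i_0(1 - \lambda)} + (1 - p)(1 - \lambda)(1 - v)\right] \\
&= p \cdot \frac{i_0(1 - v)}{b(k - i_0)^2}\left(k\log\left(1 + \frac{b(k - i_0)}{i_0}\right) - b(k - i_0)\right) + (1 - p)\left(1 - \frac{b}{2}\right)(1 - v) \\
&= p \cdot \frac{i_0(1 - v)}{k - i_0}\left(\frac{k}{b(k - i_0)}\log\left(1 + \frac{b(k - i_0)}{i_0}\right) - 1\right) + (1 - p)\left(1 - \frac{b}{2}\right)(1 - v) \\
&= (1 - v)\left(p \cdot \frac{e - \alpha}{\alpha}\left(\frac{e}{b\alpha}\log\left(1 + \frac{b\alpha}{e - \alpha}\right) - 1\right) + (1 - p)\left(1 - \frac{b}{2}\right)\right) \\
&= (1 - v)\left(\frac{pe}{\alpha}\left(\frac{e - \alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e - \alpha}\right) - \frac{e - \alpha}{e}\right) + (1 - p)\left(1 - \frac{b}{2}\right)\right)
\end{aligned}$$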
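The closed forms above can be sanity-checked numerically. The Python sketch below evaluates the final expressions for the two rates of revenue and compares the first against a midpoint-rule approximation of the expectation over a uniform bias, taking the ratio of $\Delta$ to $k$ to be $\alpha/e$ as in Theorem 4.2.2. All function names, the quadrature resolution and the parameter values are assumptions chosen for illustration.

```python
import math


def ror_a_closed(alpha, v, p, b):
    """Closed-form rate of revenue for A with bias uniform on (0, b] and 0 < alpha < e."""
    x = b * alpha / (math.e - alpha)
    strategic = (p * math.e / alpha) * (1 - math.log1p(x) / x)
    myopic = (1 - p) * b / 2
    return (1 - alpha * v) * (strategic + myopic)


def ror_b_closed(alpha, v, p, b):
    """Closed-form rate of revenue for B under the same assumptions."""
    x = b * alpha / (math.e - alpha)
    strategic = (p * math.e / alpha) * (math.log1p(x) / x - (math.e - alpha) / math.e)
    myopic = (1 - p) * (1 - b / 2)
    return (1 - v) * (strategic + myopic)


def ror_a_quadrature(alpha, v, p, b, n=20000):
    """Midpoint-rule estimate of the expectation over the bias (hypothetical check)."""
    theta = alpha / math.e                 # ratio Delta / k at the optimal design
    h = b / n
    total = 0.0
    for j in range(n):
        lam = (j + 0.5) * h
        total += p * lam / (1 - theta * (1 - lam)) + (1 - p) * lam
    return (1 - alpha * v) * total * h / b


if __name__ == "__main__":
    args = (1.0, 0.05, 0.9, 0.9)           # alpha, v, p, b
    print(ror_a_closed(*args), ror_a_quadrature(*args))
    print(ror_b_closed(*args))
```

One useful consistency check is that the visit shares add up: dividing each rate by its per-visit margin, $1 - \alpha v$ for A and $1 - v$ for B, should give fractions that sum to one.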
Observe that both of the above expressions have a left term and a right term. The left term is the
rate of revenue obtained from strategic customers, whereas the right term is that obtained from
myopic customers. As $\alpha$ ranges between 0 and $e$, the left term inside the parentheses increases
from $pb/2$ to $p$ for $\mathrm{RoR}_A$ and decreases to 0 for $\mathrm{RoR}_B$. That is, by controlling the reward budget
ratio, A is able to gain the entire strategic customer base. But observe how $\mathrm{RoR}_A$ varies with $\alpha$:
the marginal revenue term $(1 - \alpha v)$ decreases with $\alpha$ as the merchant gives higher rewards to
customers, but the market share term increases as A gains more of the strategic customer base.
As $\alpha \to 0$, $\mathrm{RoR}_A \to b/2$, i.e., the revenue earned is only due to the loyalty bias, and is equivalent to
the revenue earned by A when not running any reward program.
Figure 4.2 illustrates the region in terms of the customer parameters $(b, p)$ where it is better for
A to offer a reward program, i.e., $\mathrm{RoR}_A > \mathrm{RoR}_B$ (indicated in blue) and $\mathrm{RoR}_A > \frac{b}{2}$ (indicated in
yellow), for different values of $\alpha$, keeping $v = 0.05$ and $\beta = 0.95$ fixed. The blue region shows that
there is a clear threshold of $b$ and $p$ values beyond which $\mathrm{RoR}_A > \mathrm{RoR}_B$. More interestingly,
the threshold value of $b$ and $p$ decreases as $\alpha$ is increased toward $e$. The yellow region, in turn, shows
that if the fraction of strategic customers is not too small, the firm should choose to run a reward
program most of the time, except when $b$ is large; larger $b$ values mean that customers make more
exogenous visits, so a reward program is no longer needed to entice visits and only decreases the
profits of the reward program merchant. The intersection of the two regions, i.e., the region in green,
indicates that the range of values of $b$ for which the reward program is strictly profitable increases
as $p$ increases. We formally show this result next.
For any fixed $\alpha$, the exact conditions on $p$, $b$ and $v$ for $\mathrm{RoR}_A > \mathrm{RoR}_B$ and $\mathrm{RoR}_A > \frac{b}{2}$ are rather
complex. We will first focus on one particularly simple case: $\alpha \to e$.
Lemma 5. As $\alpha \to e$, $\mathrm{RoR}_A > \mathrm{RoR}_B$ if and only if the following condition on $b$ holds:
$$b > 2 \cdot \frac{(1 - v) - \frac{p}{1 - p} \cdot (1 - ev)}{(1 - v) + (1 - ev)} \tag{4.7}$$
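Proof. First we compute the following quantity.
$$\lim_{\alpha \to e} \frac{e - \alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e - \alpha}\right)$$
Let $x = \frac{b\alpha}{e - \alpha}$; then $x \to \infty$ as $\alpha \to e$, and it is easy to see that the above limit is
equivalent to $\lim_{x \to \infty} \frac{\log(1 + x)}{x} = 0$. Then as $\alpha \to e$, we have the following expressions for
$\mathrm{RoR}_A$ and $\mathrm{RoR}_B$.
$$\mathrm{RoR}_A = (1 - ev)\left(p + (1 - p)\frac{b}{2}\right)$$
$$\mathrm{RoR}_B = (1 - v)(1 - p)\left(1 - \frac{b}{2}\right)$$
And our condition $\mathrm{RoR}_A > \mathrm{RoR}_B$ simplifies.
$$\begin{aligned}
(1 - ev)\left(p + (1 - p)\frac{b}{2}\right) &> (1 - v)(1 - p)\left(1 - \frac{b}{2}\right) \\
\frac{b}{2}(1 - p)(1 - ev + 1 - v) &> (1 - v)(1 - p) - (1 - ev)p \\
b &> 2 \cdot \frac{(1 - v) - \frac{p}{1 - p} \cdot (1 - ev)}{(1 - v) + (1 - ev)}
\end{aligned}$$
Figure 4.2: Regions where $\mathrm{RoR}_A > \mathrm{RoR}_B$ (blue), where $\mathrm{RoR}_A > \frac{b}{2}$ (yellow) and where both are
true (green) for different values of $\alpha$: (a) $\alpha = 0.5$, (b) $\alpha = 1$, (c) $\alpha = 2$, (d) $\alpha = 2.5$. In all cases,
$\beta = 0.95$, $v = 0.05$ and $\lambda$ is drawn uniformly on $(0, b]$.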
The above lemma gives a lower bound on $b$ for $\mathrm{RoR}_A > \mathrm{RoR}_B$ in terms of $p$ and $v$. In order for
the reward program to be strictly better than the traditional pricing model, we also need $\mathrm{RoR}_A > \frac{b}{2}$.
The following lemma shows that this condition gives a corresponding upper bound on $b$.
Lemma 6. As $\alpha \to e$, $\mathrm{RoR}_A > \frac{b}{2}$ if and only if the following condition on $b$ holds:
$$b < \frac{2p}{p + \frac{ev}{1 - ev}} \tag{4.8}$$
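Proof. The condition $\mathrm{RoR}_A > \frac{b}{2}$ is equivalent to:
$$\begin{aligned}
(1 - \alpha v)\left(\frac{pe}{\alpha}\left(1 - \frac{e - \alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e - \alpha}\right)\right) + (1 - p)\frac{b}{2}\right) &> \frac{b}{2} \\
\frac{e}{\alpha}\left(1 - \frac{e - \alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e - \alpha}\right)\right) &> \frac{b}{2p}\left(\frac{1}{1 - \alpha v} - (1 - p)\right)
\end{aligned}$$
As $\alpha \to e$, the left term above approaches 1 and we are left with:
$$\begin{aligned}
b &< \frac{2p(1 - ev)}{1 - (1 - p)(1 - ev)} \\
&= \frac{2p(1 - ev)}{p - pev + ev} \\
&= \frac{2p}{p + \frac{ev}{1 - ev}}
\end{aligned}$$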
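The two bounds on $b$ from Lemmas 5 and 6 are simple to evaluate for given $p$ and $v$. The Python sketch below computes them and checks each against the limiting rate expressions from the proof of Lemma 5; the function names and the sample values of $p$, $v$ and $b$ are assumptions for illustration.

```python
import math


def lower_bound_b(p, v):
    """Lower bound on b from Eq. (4.7), for which A out-earns B as alpha tends to e."""
    ev = math.e * v
    return 2 * ((1 - v) - p / (1 - p) * (1 - ev)) / ((1 - v) + (1 - ev))


def upper_bound_b(p, v):
    """Upper bound on b from Eq. (4.8), for which A earns more than b / 2."""
    ev = math.e * v
    return 2 * p / (p + ev / (1 - ev))


def limiting_rates(p, v, b):
    """Limiting rates of revenue for A and B as alpha tends to e (proof of Lemma 5)."""
    ev = math.e * v
    ror_a = (1 - ev) * (p + (1 - p) * b / 2)
    ror_b = (1 - v) * (1 - p) * (1 - b / 2)
    return ror_a, ror_b


if __name__ == "__main__":
    p, v = 0.3, 0.05
    print(lower_bound_b(p, v), upper_bound_b(p, v))
    print(limiting_rates(p, v, 0.8))
```

Values of the upper bound above 1 simply mean that the second condition does not restrict $b$ within its admissible range.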
The previous two lemmas provide lower and upper bounds on $b$ for $\mathrm{RoR}_A > \mathrm{RoR}_B$ and $\mathrm{RoR}_A > \frac{b}{2}$,
respectively. For the reward program to be strictly better than all alternatives, both of these
conditions must be met. We combine them to get an intuitive necessary and sufficient condition on
$p$ for the reward program to be “strictly better”.
Lemma 7. As $\alpha \to e$, for the reward program to be strictly better for some values of $b$, a necessary
and sufficient condition on $p$ is:
$$p > 1 - \frac{1 - ev}{1 - ev^2} \tag{4.9}$$
Proof. The values of $b$ for which both previous lemmas are met are given by:
$$2 \cdot \frac{(1 - v) - \frac{p}{1 - p} \cdot (1 - ev)}{(1 - v) + (1 - ev)} < b < \frac{2p}{p + \frac{ev}{1 - ev}}$$
The above interval is non-empty only when the lower bound is less than the upper bound. We may
manipulate this inequality to get the simple condition on $p$ in our claim.
$$\begin{aligned}
2 \cdot \frac{(1 - v) - \frac{p}{1 - p} \cdot (1 - ev)}{(1 - v) + (1 - ev)} &< \frac{2p}{p + \frac{ev}{1 - ev}} \\
\left(p + \frac{ev}{1 - ev}\right)\left((1 - v) - \frac{p}{1 - p}(1 - ev)\right) &< p(1 - v + 1 - ev) \\
\frac{(1 - v)ev}{1 - ev} &< p(1 - ev) + \frac{p^2}{1 - p}(1 - ev) + \frac{p}{1 - p}ev \\
\frac{(1 - p)(1 - v)ev}{1 - ev} &< (1 - p)p(1 - ev) + p^2(1 - ev) + pev \\
\frac{(1 - v)ev}{1 - ev} &< p\left(1 + (1 - v)\frac{ev}{1 - ev}\right) \\
\frac{(1 - v)ev}{(1 - ev) + (1 - v)ev} &< p \\
\frac{ev - ev^2}{1 - ev^2} &< p \\
\frac{(1 - ev^2) - (1 - ev)}{1 - ev^2} &< p \\
1 - \frac{1 - ev}{1 - ev^2} &< p
\end{aligned}$$
Figure 4.3: The upper and lower bounds on $b$ as a function of $p$. Here $v = 0.05$ and $\alpha \to e$.
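The threshold on $p$ in Eq. (4.9) can be checked against the two bounds on $b$ directly. The Python sketch below computes the threshold and confirms, for a sample discount, that the lower bound falls below the upper bound just above the threshold and not just below it. Function names and the test values are illustrative assumptions; here $ev^2$ is read as $e \cdot v^2$, as in the derivation.

```python
import math


def p_threshold(v):
    """Minimum fraction of strategic customers from Eq. (4.9)."""
    ev = math.e * v
    ev2 = math.e * v * v
    return 1 - (1 - ev) / (1 - ev2)


def lower_bound_b(p, v):
    """Lower bound on b from Eq. (4.7)."""
    ev = math.e * v
    return 2 * ((1 - v) - p / (1 - p) * (1 - ev)) / ((1 - v) + (1 - ev))


def upper_bound_b(p, v):
    """Upper bound on b from Eq. (4.8)."""
    ev = math.e * v
    return 2 * p / (p + ev / (1 - ev))


def interval_is_nonempty(p, v):
    """True when some b satisfies both bounds (ignoring the restriction of b to (0, 1))."""
    return lower_bound_b(p, v) < upper_bound_b(p, v)


if __name__ == "__main__":
    v = 0.05
    p_min = p_threshold(v)
    print(p_min)
    print(interval_is_nonempty(p_min + 0.01, v), interval_is_nonempty(p_min - 0.01, v))
```

Evaluating the threshold at a few discount values also shows it rising with $v$, which is the monotonicity used later in the section.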
Thus, for any choice of $v$, and $p$ obeying the above condition, the combination of the above
lemmas gives an interval of $b$ values for which the reward program is the most profitable choice for
the merchant. Figure 4.3 shows the bounds on $b$ for varying values of $p$, keeping $v = 0.05$ fixed and
restricting the range of $b$ values to $(0, 1)$. Notice that the upper bound on $b$ increases as a function
of $p$ while the lower bound decreases with $p$, so the interval of $b$ values where the reward program is
strictly better increases with $p$. We formalize this observation in the next lemma.
Lemma 8. As $\alpha \to e$, for $p$ obeying Eq. 4.9, the range of values of $b$ for which the
reward program is strictly better increases as $p$ increases.
Proof. We know that the range of $b$ values in which we are interested is given by the interval
$$2 \cdot \frac{(1 - v) - \frac{p}{1 - p} \cdot (1 - ev)}{(1 - v) + (1 - ev)} < b < \frac{2p}{p + \frac{ev}{1 - ev}}$$
Because $p$ obeys Eq. 4.9, the above interval is non-empty. We will show that the upper bound
increases with $p$ and the lower bound decreases with increasing $p$. Therefore, as $p$ increases, the
interval of $b$ values for which the reward program is strictly better grows. First consider the upper
bound, $UB(p) = \frac{2p}{p + \frac{ev}{1 - ev}}$.
$$UB'(p) = \frac{2ev}{(1 - ev)\left(p + \frac{ev}{1 - ev}\right)^2} \ge 0, \quad \forall p$$
Now we consider the lower bound, $LB(p) = 2 \cdot \frac{(1 - v) - \frac{p}{1 - p} \cdot (1 - ev)}{(1 - v) + (1 - ev)}$.
$$LB'(p) = -\frac{2(1 - ev)}{(1 - p)^2((1 - v) + (1 - ev))} \le 0, \quad \forall p$$
Figure 4.4 shows the upper and lower bounds on $b$ for all valid pairs of $p$ and $v$ with $\alpha \to e$. The
top plot shows the lower bound on $b$ and the bottom plot depicts the upper bound. For a particular
$(p, v)$ pair, if the color on the top plot is darker than the corresponding color on the bottom plot,
then this pair has a valid $b$ interval in which the reward program is strictly better. This figure also
exhibits the increasing range of $b$ values with increasing $p$; for large values of $p$ and moderate values
of $v$, we observe no restrictions on $b$ for the reward program to be strictly better. We combine all
the above observations into the following theorem.
Theorem 4.2.3. Under proportional budgeting, as $\alpha \to e$, a necessary and sufficient condition for
the reward program to be strictly better is a lower bound on $p$ which increases with $v$. And as $p$
increases beyond the lower bound, the region of allowable $b$ for which the reward program is strictly
better becomes larger.
Now we generalize the above result to all values of $\alpha$. The conditions are more complex, but the
results and intuitions are similar. The proofs are technical, and we leave them to the appendix.
Lemma 9. Fix $\alpha \in (0, e)$. For any $(p, v)$ pair, there exists some upper bound $b_1 \in [0, 1]$ such that
for all $b \le b_1$, $\mathrm{RoR}_A \ge \frac{b}{2}$.
Figure 4.4: Bounds on $b$ for various values of $p$ and $v$ at $\alpha \to e$. Top shows lower bounds on $b$ for
$\mathrm{RoR}_A \ge \mathrm{RoR}_B$ and bottom shows upper bounds on $b$ for $\mathrm{RoR}_A \ge \frac{b}{2}$.
Lemma 10. Fix $\alpha \in (0, e)$. For any $(p, v)$ pair, there exists some lower bound $b_0 \in [0, 1]$ such that
for all $b \ge b_0$, $\mathrm{RoR}_A > \mathrm{RoR}_B$.
We combine the above two lemmas as before to get the following theorem.
Theorem 4.2.4. Fix $\alpha \in (0, e)$. For any value of $v$, there exists a lower bound $p_0$ such that for any
$p$ greater than $p_0$, there exists a range $(b_0, b_1)$ between 0 and 1 such that for all $b$ lying between $b_0$
and $b_1$, offering the reward program is strictly better for A.
The above results can be helpful in the following way: if a merchant estimates that
the loyalty bias parameter is drawn from a uniform distribution and has good estimates of its target
customer population, i.e., of the $b$ and $p$ values, it can find appropriate reward budget ratios $\alpha$ that
could make running a reward program strictly better against a traditional pricing competitor. More
importantly, these results show that under mild assumptions on the customer population parameters,
reward programs can be beneficial in the competitive duopoly model.
4.3 Conclusions
We investigated the optimal design of a frequency reward program against traditional pricing in
a competitive duopoly. We modeled the behavior of customers valuing their utilities in rational
economic terms, and our theoretical results agree with past empirical studies. Assuming general
distributions of the customer population, we characterized optimal parameters for the design of a reward
program, and under more specific parameter distribution assumptions, we showed the conditions
on customer population parameters which make the reward program strictly better. In short, if a
merchant can make good estimates of the customer population parameters, our model and results
can help it understand the pros and cons of running a frequency reward program
against traditional pricing.
Though our research offers some interesting managerial insights, there are some limitations to
our study. Our results on revenue comparisons assumed specific distributions for the customer
population, though our framework can be extended to other distributions as well. Moreover, estimating
the customer population distribution and parameters using real transactional data is an interesting
question in itself. That is, backing this research with empirical and experimental studies could
provide strong quantification of the intuitions we discuss. We modeled customer behavior in rational
economic terms, mainly to understand the rational components that affect the decision-making
process. Tying in the effects of our research with past models of psychological behavior patterns
of customers toward reward programs would be another practically relevant problem to address.
Finally, we modeled a competitive duopoly, but left the traditional pricing merchant as non-strategic.
Understanding how competition affects the equilibrium prices and reward program parameters could
give intuitions about a more practical scenario.
4.4 Appendix
4.4.1 Proof of Lemma 9
Proof. We delay the proof of this lemma to first prove a helpful proposition. It is a straightforward
computation to see that the condition $RoR_A \ge b/2$ is equivalent to:
$$\frac{1}{b}\left(1 - \frac{e-\alpha}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right)\right) \ge \frac{\alpha\,(1 - (1-p)(1-\alpha v))}{2pe\,(1-\alpha v)} \iff g(b;\alpha) \ge h(p, v;\alpha)$$
where we have defined the functions $g(b)$ and $h(p, v)$, for fixed $\alpha$, as the two sides of the above inequality.
Proposition 4. For a fixed $\alpha$, $g(b)$ is non-increasing for all $b \in (0, 1)$.
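Proof. We take the derivative of $g$:
$$\begin{aligned}
& g'(b) = \frac{2(e-\alpha)}{b^3\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right) - \frac{1}{b^2} - \frac{1}{b^2\left(1 + \frac{b\alpha}{e-\alpha}\right)} \le 0 \\
\iff\ & \frac{2(e-\alpha)}{b\alpha}\log\left(1 + \frac{b\alpha}{e-\alpha}\right) \le 1 + \frac{1}{1 + \frac{b\alpha}{e-\alpha}} \\
\iff\ & \frac{2\log(1+x)}{x} \le 1 + \frac{1}{1+x}
\end{aligned}$$
where $x = \frac{b\alpha}{e-\alpha}$, and as $b \in (0, 1)$, $x \in \left(0, \frac{\alpha}{e-\alpha}\right)$. We can see that as $x \to 0$, the above inequality is
an equality. We represent the LHS of the above inequality as $L(x)$ and the RHS as $R(x)$. Next we show
that $L(x)$ decreases more quickly than $R(x)$ does for positive $x$, thereby proving the proposition.
First we show that in the range of $x$ the following holds true:
$$\left(2 - \frac{1}{1+x}\right)^2 \le 2\log(1+x) + 1 \tag{4.10}$$
To show the above, observe that as $x \to 0$ both the LHS and RHS are equal. And it is easy to show
that the derivative of the LHS is no larger than the derivative of the RHS for all $x \ge 0$, as shown:
$$\begin{aligned}
& (1+x) + \frac{1}{1+x} \ge 2 \\
\implies\ & 2 - \frac{1}{1+x} \le 1 + x \\
\implies\ & \left(2 - \frac{1}{1+x}\right)\cdot\frac{1}{1+x} \le 1 \\
\implies\ & 2\cdot\left(2 - \frac{1}{1+x}\right)\cdot\left(\frac{1}{1+x}\right)^2 \le \frac{2}{1+x}
\end{aligned}$$
In the last line, the left hand side is the derivative of the LHS of Eq. 4.10 and the right hand side is the
derivative of its RHS.
Now we can rearrange Eq. 4.10 as follows:
$$\begin{aligned}
& \left(2 - \frac{1}{1+x}\right)^2 \le 2\log(1+x) + 1 \\
\implies\ & \left(1 + \frac{x}{1+x}\right)^2 \le 2\log(1+x) + 1 \\
\implies\ & \left(\frac{x}{1+x}\right)^2 + \frac{2x}{1+x} \le 2\log(1+x) \\
\implies\ & 2\left(\frac{x}{1+x} - \log(1+x)\right) \le -\left(\frac{x}{1+x}\right)^2 \\
\implies\ & \frac{2\left(\frac{x}{1+x} - \log(1+x)\right)}{x^2} \le -\left(\frac{1}{1+x}\right)^2
\end{aligned}$$
The left hand side of the last line is $L'(x)$ and the right hand side is $R'(x)$. Since $L$ and $R$ agree in the
limit $x \to 0$ and $L'(x) \le R'(x)$, we get $L(x) \le R(x)$ on the whole range, i.e., $g'(b) \le 0$.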
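As a numerical sanity check on the two inequalities used in this proof (not a substitute for the argument), the following Python sketch evaluates the reduced inequality $2\log(1+x)/x \le 1 + 1/(1+x)$ and Eq. 4.10 on a grid. The grid range $(0, 10]$ and the function names are assumptions chosen for illustration.

```python
import math

def L(x):
    """Left-hand side 2*log(1+x)/x of the reduced inequality."""
    return 2.0 * math.log1p(x) / x

def R(x):
    """Right-hand side 1 + 1/(1+x) of the reduced inequality."""
    return 1.0 + 1.0 / (1.0 + x)

def eq_4_10_gap(x):
    """RHS minus LHS of Eq. 4.10; nonnegative where the inequality holds."""
    u = 1.0 / (1.0 + x)
    return 2.0 * math.log1p(x) + 1.0 - (2.0 - u) ** 2

# Arbitrary illustrative grid on (0, 10].
xs = [k / 100.0 for k in range(1, 1001)]
worst_main = max(L(x) - R(x) for x in xs)      # should not exceed 0
worst_410 = min(eq_4_10_gap(x) for x in xs)    # should not fall below 0
print(worst_main, worst_410)
```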
Thus, $g(b)$ is non-increasing in $b$, so for any $(p, v)$ pair, we may compute $h(p, v;\alpha)$, which will then
fall into one of the following three cases (here $g(0)$ denotes the limit of $g(b)$ as $b \to 0$).
• $h(p, v;\alpha) \ge g(0)$. So no value of $b$ makes the reward program profitable.
• $h(p, v;\alpha) \le g(1)$. So any value of $b$ makes the reward program profitable.
• $h(p, v;\alpha) = g(b_0)$ for some $b_0 \in (0, 1)$. So the reward program is profitable for all $b \le b_0$ and
not otherwise.
The above proposition and discussion prove our lemma: for fixed $\alpha$ and any $(p, v)$ pair, there is
some upper bound on $b$ such that $RoR_A \ge b/2$ for all $b$ below it.
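The three cases above translate directly into a small routine. The Python sketch below is a minimal illustration under stated assumptions: e is treated as a positive parameter with $\alpha < e$, the limit $g(0) = \alpha / (2(e - \alpha))$ follows from $1 - \log(1+x)/x \approx x/2$ for small $x$, and the function names and numeric inputs are hypothetical. Bisection is valid here because $g$ is non-increasing by Proposition 4.

```python
import math

def g(b, alpha, e):
    """g(b; alpha) as defined in the proof of Lemma 9."""
    x = b * alpha / (e - alpha)
    return (1.0 - math.log1p(x) / x) / b

def h(p, v, alpha, e):
    """h(p, v; alpha) as defined in the proof of Lemma 9."""
    return alpha * (1.0 - (1.0 - p) * (1.0 - alpha * v)) / (2.0 * p * e * (1.0 - alpha * v))

def upper_bound_b(p, v, alpha, e, iters=60):
    """Return the threshold on b: 0.0 (no b works), 1.0 (all b work), or an interior root."""
    target = h(p, v, alpha, e)
    g_at_zero = alpha / (2.0 * (e - alpha))  # limit of g(b) as b -> 0
    if target >= g_at_zero:
        return 0.0
    if target <= g(1.0, alpha, e):
        return 1.0
    lo, hi = 0.0, 1.0  # invariant: condition holds near lo, fails at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid, alpha, e) >= target:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical inputs for illustration only.
print(upper_bound_b(p=0.8, v=0.5, alpha=0.5, e=1.0))
```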
4.4.2 Proof of Lemma 10
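Proof. Let $x = \frac{b\alpha}{e-\alpha}$. Then $RoR_A > RoR_B$ can be evaluated as follows:
$$\begin{aligned}
& p\,\frac{e}{\alpha}\left(1 - \frac{\log(1+x)}{x}\right)(1 - \alpha v + 1 - v) - p(1-v) + (1-p)\,\frac{b}{2}\,(1 - \alpha v + 1 - v) + p(1-v) > 1 - v \\
\implies\ & p\left(1 - \frac{\log(1+x)}{x}\right) + (1-p)\,\frac{b\alpha}{2e} > \frac{\alpha}{e}\cdot\frac{1-v}{1 - \alpha v + 1 - v}
\end{aligned} \tag{4.11}$$
Since $\alpha$ is a constant, the LHS above is a function of $b$ and $p$. Let the LHS above be $L(b, p)$. We
first show that in the range $b \in [0, 1]$, $1 - \frac{\log(1+x)}{x} > \frac{b\alpha}{2e}$, which shows that $L(b, p)$ is increasing in
$p$.
$$1 - \frac{\log(1+x)}{x} > \frac{b\alpha}{2e} \iff x - \log(1+x) > \frac{b^2\alpha^2}{2e(e-\alpha)}$$
Observe that the LHS is equal to the RHS when $b \to 0$. All we show is that the LHS increases faster than the RHS
in the range of $b$, i.e., we compare the derivatives of the two sides with respect to $b$:
$$\begin{aligned}
& \left(1 - \frac{1}{1+x}\right)\frac{\alpha}{e-\alpha} > \frac{b\alpha^2}{e(e-\alpha)} \\
\iff\ & \frac{1}{1+x} > \frac{e-\alpha}{e} \\
\iff\ & \frac{e}{e-\alpha} > 1 + \frac{b\alpha}{e-\alpha}
\end{aligned}$$
The last inequality holds for $b \in [0, 1)$ (with equality at $b = 1$), which suffices. Hence $L(b, p)$ increases with $p$.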
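Now we show that $L(b, p)$ increases with $b$ as well. First observe:
$$\frac{\partial L(b, p)}{\partial b} = p\left(\frac{\log(1+x) - \frac{x}{1+x}}{x^2}\right)\frac{\alpha}{e-\alpha} + (1-p)\frac{\alpha}{2e}$$
Thus $\frac{\partial L(b, p)}{\partial b} > 0$ is equivalent to:
$$\begin{aligned}
& (1-p)\frac{\alpha}{2e} > p\left(\frac{\frac{x}{1+x} - \log(1+x)}{x^2}\right)\frac{\alpha}{e-\alpha} \\
\iff\ & \frac{(1-p)\,b^2\alpha^2}{2e(e-\alpha)} > p\left(1 - \frac{1}{1+x} - \log(1+x)\right)
\end{aligned}$$
Again the LHS and RHS are equal as $b \to 0$. All we show again is that the LHS increases faster than
the RHS, by comparing derivatives with respect to $b$:
$$\frac{(1-p)\,b\alpha^2}{e(e-\alpha)} > p\left(\frac{1}{(1+x)^2} - \frac{1}{1+x}\right)\frac{\alpha}{e-\alpha}$$
Clearly the RHS is nonpositive when $b \in (0, 1]$ and the LHS is nonnegative, with at least one of them nonzero. Hence proved.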
Thus $L(b, p)$ is increasing in both $b$ and $p$. And the condition required is that $L(b, p)$ is greater than
some constant value which depends on $v$. Hence for any $v$ there exists a smooth $(b_0, p_0)$ curve such
that for all $b \ge b_0$ and $p \ge p_0$ the revenue rate of the reward program merchant is larger.
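The threshold curve described above can be traced numerically. The Python sketch below is a hedged illustration: it assumes $L(b, p)$ and the constant on the right of (4.11) exactly as written in this proof, treats e as a positive parameter with $\alpha < e$, and uses hypothetical values for $v$, $\alpha$ and e. For each $p$ it finds by bisection the smallest $b$ satisfying (4.11), relying on the monotonicity of $L$ in $b$.

```python
import math

def L(b, p, alpha, e):
    """Left side of condition (4.11)."""
    x = b * alpha / (e - alpha)
    return p * (1.0 - math.log1p(x) / x) + (1.0 - p) * b * alpha / (2.0 * e)

def threshold(v, alpha, e):
    """Right side of condition (4.11); depends on v for fixed alpha and e."""
    return (alpha / e) * (1.0 - v) / (1.0 - alpha * v + 1.0 - v)

def lower_bound_b(p, v, alpha, e, iters=60):
    """Smallest b in (0, 1] with L(b, p) > threshold, or None if even b = 1 fails."""
    c = threshold(v, alpha, e)
    if L(1.0, p, alpha, e) <= c:
        return None
    lo, hi = 0.0, 1.0  # L(b, p) -> 0 as b -> 0, so the condition fails near lo
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if L(mid, p, alpha, e) > c:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical inputs (v = 0.5, alpha = 0.5, e = 1) for illustration only.
curve = [(p, lower_bound_b(p, v=0.5, alpha=0.5, e=1.0)) for p in (0.7, 0.8, 0.9, 1.0)]
for p, b0 in curve:
    print(p, b0)
```

For these inputs the computed $b_0$ shrinks as $p$ grows, which is what monotonicity of $L$ in both arguments would suggest.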
Chapter 5
Future Directions
Loyalty reward programs constitute a large portion of the retail industry. In this thesis, we modeled
three aspects of loyalty reward programs: strategic network formation of coalitions; an alternate
decentralized methodology to settle transactions within coalition loyalty programs; and customer
choice dynamics in a competitive duopoly of a frequency reward program and traditional pricing
merchant. We showed that conducting Nash Bargaining between pairs of merchants is a strong tool
to negotiate the formation of coalitions. Our model for settling credits in coalition loyalty programs
has properties which make it a viable alternative. And we formulate conditions for when it is optimal
for a merchant to offer a reward program against traditional pricing.
Though this thesis provides a holistic overview of different aspects of loyalty reward programs,
there are many limitations pointing toward future work. In Chapter 2, we investigated the
negotiation between different merchants in coalition loyalty programs, but ignored many business aspects
such as complementarity of route structures, government regulations, and existing business ties, to name a
few. Moreover, we assumed many aspects of customer behavior to be exogenous. For instance, in our
model, customers a priori chose their base merchant, into whose currency they converted currency
earned from other merchants. But this choice could very well depend on the coalition structures
themselves. Also, the pricing of available services at different merchants heavily depends on the
demand and supply gap, which we did not take into consideration. Endogenizing some of these
aspects into a more holistic model would be an interesting direction for future work. One important
open problem in the credit settlement process we introduced in Chapter 3 is the long-term
stability of the network, i.e., depending on different transaction regimes initiated by customers, what
exchange rates between merchants could lead to long-term transactional liquidity. Additionally,
these transaction regimes could be dynamic themselves; for instance, they could depend on price
fluctuations, and not be exogenously given. In Chapter 4, we discussed how competitive pricing
affects customer behavior toward the reward program, and formulated an optimal design of a
standalone reward program. But competitive pricing would also affect the customer demand for different
merchant currencies in a coalition loyalty program, which in turn could influence the formation of
coalitions and the settlement process between different merchant partners. In conclusion, a
framework combining all the above-mentioned aspects, i.e., to understand customer behavior in coalition
loyalty programs, its effects toward setting up optimal reward schemes and formation of coalitions
with strategic merchants, could provide effective mathematical machinery toward automating some
marketing processes in the retail sector.