MOBILE OFFLOADING FOR ENERGY-EFFICIENT
COMPUTATION ON SMARTPHONES
BY
LIYAO XIANG
A THESIS SUBMITTED IN CONFORMITY WITH THE REQUIREMENTS
FOR THE DEGREE OF MASTER OF APPLIED SCIENCE, DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING,
AT THE UNIVERSITY OF TORONTO.
COPYRIGHT © 2015 BY LIYAO XIANG. ALL RIGHTS RESERVED.
Mobile Offloading for Energy-efficient Computation on Smartphones
Master of Applied Science Thesis
Edward S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
by Liyao Xiang
2015
Abstract
Mobile offloading enables mobile devices to distribute computation-intensive tasks to the cloud
or other devices for energy conservation or performance gains. In principle, the idea is to trade
the relatively low communication energy expense for high computation power consumption. In
this thesis, we first focus on the technique of mobile code offloading to the cloud by proposing
the new technique of coalesced offloading, which exploits the potential for multiple applica-
tions to coordinate their offloading requests with the objective of saving additional energy on
mobile devices. We then turn our attention to collaborative mobile computing, where a group of
mobile users with a common target job form coalitions to reduce the overall energy cost. We
propose distributed collaboration strategies through game theory, formulating the problem as
a non-transferable utility coalitional game and solving it with merge-and-split rules.
To my family
Acknowledgments

First, I would like to express my gratitude towards my advisor, Professor Baochun Li. Without
his guidance, support, and patience, my master's study could not have been completed so smoothly.
Throughout the research and thesis writing process, he provided insightful advice on exploring
some interesting and promising visions, as well as sound suggestions on technical writing.
I am also thankful to all dear members in iQua Research Group at the University of Toronto,
who not only offered practical suggestions to my research but also created a stimulating and
delightful environment in which to learn and grow. Special regards to Chen Feng, who recently
took a tenure-track faculty position at the University of British Columbia, for his funny jokes
and wise advice.
Last but not least, I would like to give my genuine regards to my parents, who love and
support me unconditionally throughout the entire journey. They are the source of my happiness
and power.
Contents

Abstract
Acknowledgments
List of Tables
List of Figures

1 Introduction
  1.1 Mobile Code Offloading
  1.2 Collaborative Mobile Computing
  1.3 Thesis Organization

2 Related Work
  2.1 Mobile Code Offloading
  2.2 Collaborative Mobile Computing

3 Coalesced Offloading from Mobile Devices to the Cloud
  3.1 Overview
  3.2 Motivation and Problem Formulation
    3.2.1 Motivation
    3.2.2 The Coalesced Offloading Problem
  3.3 Coalesced Offloading: an Offline Solution
    3.3.1 From Continuous-Time to Discrete-Time Formulation
    3.3.2 Optimal Offline Algorithm
  3.4 Ready, Set, Go: Online Algorithms
    3.4.1 The Dynamic TCP Acknowledgment Problem
    3.4.2 The Online Algorithm Aθ
    3.4.3 Deterministic Online Algorithm: Performance Analysis
    3.4.4 Performance Analysis of Aθ
  3.5 Performance Evaluation
    3.5.1 Measuring the Tail Time
    3.5.2 Model-Driven Evaluation
    3.5.3 Experiments on the Mobile Phone
  3.6 Summary

4 Coalition Formation for Collaborative Mobile Computing
  4.1 Overview
  4.2 The Collaborative Computing Model
  4.3 Task Distribution for Mobile Applications
  4.4 Coalition Formation among Mobile Users
    4.4.1 Coalitional Game and Properties
    4.4.2 Coalition Formation Algorithm
    4.4.3 Stability Analysis
  4.5 Simulation Results and Analysis
    4.5.1 Simulation Setup
    4.5.2 Performance Evaluation
  4.6 Summary

5 Conclusion

Bibliography

List of Tables

3.1 Energy cost reduction compared with the naive strategy.

List of Figures

3.1 The benefits of coalesced offloading.
3.2 The coalesced offloading problem: an illustrative example.
3.3 The online algorithm A1.
3.4 The proof of the competitive ratio of Aθ.
3.5 The proof of Lemma 3.2 (to prove the competitive ratio of Aθ).
3.6 (a) The fcost of offloading requests with different levels of fluctuations. (b) The estimated energy cost with varying α.
3.7 (c) The energy consumption on the mobile device with varying α. (d) The request transmissions on the mobile device w/o the RSG algorithm.
3.8 (e) The request transmissions on the mobile device w/o the RSG algorithm. (f) The battery voltage change as measured on the mobile device. The top figure shows the result of the naive strategy, and the bottom one shows the result of the RSG algorithm.
4.1 The workflow of an example job.
4.2 The resource graph.
4.3 An example of mapping jobs to a set of mobile devices.
4.4 Average energy cost per user when users are non-cooperative, or run the centralized and merge-and-split algorithms using Pareto order and utilitarian order. (a) User connecting ratio 0.10. (b) User connecting ratio 0.35. (c) User connecting ratio 0.60. (d) User connecting ratio 0.95.
4.5 Average coalition size when users run the centralized and merge-and-split algorithms. (a) User connecting ratio 0.10. (b) User connecting ratio 0.35. (c) User connecting ratio 0.60. (d) User connecting ratio 0.95.
4.6 Average running time when users run the centralized and merge-and-split algorithms.
Chapter 1
Introduction
Thin, user-interactive mobile devices and the powerful, remote cloud appear to sit at opposite
ends of the computing world. At one end, the mobile device boasts the advantages of mobility
and responsiveness to the user, but generally lacks computation power and battery capacity; at
the other end, the cloud service requires stable network connections, yet its computation power
appears infinite to its users. To exploit the advantages of both, it is natural to offload some
computation-intensive parts of a mobile application for execution on the cloud. As a
generalization of this idea, offloaded parts of any granularity can be regarded as tasks to be
executed on any other device. The goal of saving energy is achieved by trading the relatively
low energy consumption of network communication for the high power expense of intensive
computation.
In this thesis, we study two problems: the first asks how to save energy when offloading code
from a single device to the cloud; the second concerns collaboration among different mobile
devices to minimize the overall energy consumption. In what follows, we give a brief overview
of each problem, and present how this thesis is organized.
1.1 Mobile Code Offloading
Heralded as a primary feature in mobile cloud computing, code offloading from mobile devices
to the cloud has received a substantial amount of research attention in the recent literature. The
concept of code offloading is intuitively simple: with an abundance of computing power in the
cloud computing infrastructure and a keen awareness of power efficiency on mobile devices, it
is natural to offload a portion of the computational requests within computationally intensive
mobile applications. With its roots dating back to the notion of thin clients in the 1990s, code
offloading may be instrumental in a wide variety of mobile applications, from natural language
processing (e.g., Apple’s Siri) to augmented reality.
Code offloading can be performed at the granularity level of thread execution [1,2], method
invocation [3], and even full VM migration [4]. Either way, offloading requests have been well
planned, with the optimization objective of gaining better application performance and energy
efficiency. To achieve the objective, a typical solution includes a profiler on the mobile device
that collects runtime statistics of the mobile application, as well as a solver that partitions the
computation in a way that optimizes energy consumption or application performance.
However, existing works have so far focused on one application only. In reality, mainstream
mobile operating systems support multitasking, with multiple applications running simultane-
ously on a mobile device. Particularly, there may be several services running in the background
while one or two applications running on screen. A user may ask Siri (or Google Now) about
a location, viewing the augumented reality street view on her phone, while in the background
downloading a cloud-sourced video over 3G or 4G mobile networks at the same time. When
multiple applications send their offloading requests to the cloud independently without any
coordination, the cellular or Wi-Fi network interface needs to be activated to transmit these
requests, entering the high-power state at arbitrary times. This may potentially consume more
energy: once a network interface enters the high-power state, it lingers in this state for a pe-
riod of time, usually seconds, after completing the transmission of all the existing requests [5].
The amount of energy the network interface consumes in the high-power state before it enters
stand-by again, referred to as the tail energy, is proportional to the length of time the interface
stays in this state (referred to as the tail time).
During the tail time, the smartphone transmits nothing and merely waits for incoming
transmission requests, so that portion of energy is simply wasted. If
multiple applications send their transmission requests without careful scheduling, much energy
is wasted in the high-power state. In Chapter 3, we study the tail time phenomenon,
and propose algorithms to reduce such energy waste as much as possible without incurring too
much penalty in latency.
1.2 Collaborative Mobile Computing
Alongside the effort of computational offloading to the cloud, the trend of mobile users
seeking computing power outside their devices to assist with computation-intensive tasks is
increasingly popular. Cyber foraging and cloudlets [6] have long been proposed as a way to
liberate mobile devices from severe resource constraints. Apart from migrating computation to
the cloud, other features, such as Apple's recently released 'Continuity', have made
cross-platform application state migration possible, allowing users to smoothly move between
Mac OS X and iOS devices in close proximity when editing an email or answering a phone call.
Moreover, the recent literature on I/O sharing [7] between mobile systems further facilitates
collaboration between mobile devices, by permitting an application running on one mobile
system to access the I/O devices on another device. This points to a clear trend: more and
more tasks that previously could only be done on a single device are now designed to be
executed in a distributed fashion across interconnected devices and platforms. Indeed, many
prototype applications today, in areas such as social sensing, crowdsourcing, and content
sharing, need to recruit many devices to work jointly.
Moving from the powerful cloud computing infrastructure to nearby computational devices,
the strategy of assigning tasks is no longer a binary choice. When multiple users are involved,
the problem of finding a task arrangement that is optimal in terms of energy consumption is
NP-hard. A possible approach is to let the users make decisions in a distributed fashion and
form coalitions. In Chapter 4, the problem of collaborative mobile computing is studied with
a focus on task distribution and coalition formation.
1.3 Thesis Organization
The remainder of this thesis is organized as follows. In Chapter 2, we present the background
and related work. Chapter 3 discusses our energy-saving approach of coalescing offloaded
computational requests from devices to the cloud; both simulations and real-world experiments
on the iPhone are presented. Chapter 4 proposes a coalitional game for collaborative mobile
computing, and adopts merge-and-split rules to form the coalitions; the efficiency of the
algorithm is verified by simulations. In Chapter 5, we conclude our work and discuss
directions for future work.
Chapter 2
Related Work
In this chapter, we briefly review the related literature in the field of mobile cloud
computing, illustrate the differences between other work and ours, and motivate the chapters
that follow.
2.1 Mobile Code Offloading
Many existing works in the literature of code offloading between mobile devices and the cloud
only considered the optimal offloading choice of a single application. Works such as [1, 3] de-
cided at runtime which parts of the application are to be remotely executed with an optimization
engine, in order to achieve the best energy savings. Kosta et al. [8] developed a framework of
smartphone virtualization in the cloud, allowing method-level computation offloading. Gordon
et al. [2] used a distributed shared memory technique instead of remote procedure calls to sup-
port multi-threaded applications to run on multiple machines. However, it has been observed
that the on-and-off switching state of the network interface, incurred by offloading requests
from multiple simultaneously running applications, unnecessarily consumes much idle energy.
Without considering this aspect in the optimization framework, it is insufficient to discuss
code offloading for a single application alone.
Our work is also closely related to Balasubramanian et al. [5], as it found that 3G incurs a
high tail energy overhead for lingering in the high-power state after the completion of a transfer.
It also proposed a scheduling algorithm to minimize the energy consumed while meeting user-
specified deadlines. However, the scheme is only designed for delay-tolerant and prefetching
applications, without taking the length of delays into account.
Our online strategies are tied to the online algorithm literature [9–11]. The dynamic TCP
acknowledgment problem is a generalization of the classical ski rental problem with the same
competitive ratio. We show that our problem is a generalization of the dynamic TCP
acknowledgment problem, and we prove that our algorithms achieve the same competitive ratios
as in that special case, which are already known to be the best possible. A similar problem
arises in scheduling tasks to minimize the total power consumption [12], which presented an
effort to minimize the number of "gaps," i.e., idle periods, in application execution.
2.2 Collaborative Mobile Computing
Most previous works in the area of application migration or code offloading, such as MAUI [3],
CloneCloud [1], and ThinkAir [8], only consider offloading via the Internet to powerful
servers in the remote cloud. The cloud seems to be a perfect backend for smartphones, but the
long WAN latency often fails to meet the stringent delay requirements of mobile applications.
To counter the long WAN latency, the idea of a "Mobile Cloud" was proposed in [13, 14] to take
advantage of computing power in close proximity for execution speedups and energy savings.
Compared to offloading to the cloud, the "Mobile Cloud" is especially useful when Internet
access is expensive or unavailable. Our work is most closely related to this category in terms
of collaboration among multiple devices.
However, the setting of our work is different from Serendipity [15] and other work on sharing
computing resources: in those works, one initiator mobile device typically 'borrows' idle
computational resources available on other devices in its environment to accomplish tasks of a
certain structure. In this work, a device 'borrows' computation power from mobile devices that
have the same job objective and can share the workload. Works like [13, 16, 17] discuss
collaborative computation performed in a distributed fashion on a set of mobile devices, but
the incentives of participants are not taken into account. As a matter of fact, it is very hard
to define fairness in a situation where users contribute multiple resources (computation power
and network connections), and even harder to devise a proper incentive mechanism that
motivates everyone to contribute. In this work, we do not assume users are selfish in the sense
that each is purely motivated by its own benefit and switches between coalitions whenever it
sees a utility improvement: such an approach may reflect how users behave in the real world,
but it is impractical, taking too many iterations before all users reach an agreement. Instead,
we adopt a simple merge-and-split rule for each coalition to follow, in which coalitions can be
merged into one or split into any number of smaller coalitions when at least one user sees a
strict utility improvement while no other user's utility is hurt.
From the coalition formation perspective, our work is related to the cooperative media
streaming problem [18] and the coalition formation problem of UAVs in wireless networks [19].
While in cooperative streaming each mobile device contributes its bandwidth endowment in a
cooperative manner, in our work both the computation power and the connection links contribute
to a resource pool that is shared among the users in a coalition. The cooperative streaming
game is a game without transferable utility, while the UAV deployment problem is a
transferable utility game. The problem in our work is found to be a non-transferable utility
(NTU) game, where each user adjusts its strategy according to its energy expense, which cannot
be transferred across devices.
Chapter 3
Coalesced Offloading from Mobile Devices
to the Cloud
3.1 Overview
In this chapter, we propose the concept of coalesced offloading, which seeks to achieve
additional energy savings by exploiting the potential for multiple mobile applications to coordinate
their code offloading requests to the cloud. Coalesced offloading realizes the intuition that, by
sending code offloading requests in “bundles,” the period of time that the network interface
stays in the high-power state can be reduced, thus saving additional energy. Our proposed
technique of coalesced offloading is inspired by timer coalescing, used in the kernel of Mac
OS X 10.9 Mavericks, which improves energy efficiency by deferring and shifting computation
tasks from multiple applications into the same time interval. To our knowledge, our work
represents the first attempt to improve power efficiency by bundling offloading requests from
multiple applications in a coalesced fashion.
Since bundling offloading requests may incur additional offloading delays, we choose to
formulate the problem of coalesced offloading as a joint optimization problem, with both the
energy cost and the response time considered. The highlight of our original contributions is the
design of two online algorithms, collectively referred to as Ready, Set, Go (RSG), that are de-
signed to solve our optimization problem. As the benchmark for evaluating RSG, we first study
an offline algorithm that computes the optimal solution with a time complexity of O(n), with
the impractical assumption that the exact arrival times of future requests from all the applica-
tions are known a priori. Without any knowledge of upcoming offloading requests beforehand,
our deterministic online algorithm is 2-competitive against the optimal offline algorithm, and
our randomized online algorithm is e/(e− 1)-competitive (1.58-competitive). We analytically
show that both online algorithms achieve the best possible competitive ratios in their respective
cases. Our online algorithms are simple enough to implement: using both simulations and our
real-world implementation on the iOS platform, we show that the RSG online algorithm is able
to realize an additional energy saving of up to 20% for the deterministic case and 27% for the
randomized case with a variety of offloading request patterns.
3.2 Motivation and Problem Formulation
In this section, we first motivate the notion of coalesced offloading, and then formally formulate
the optimization problem of making optimal offloading decisions, considering both the energy
cost and application performance.
3.2.1 Motivation
With current code offloading techniques, if a portion of the application code (e.g., a method
invocation or a thread) is to be offloaded to the cloud, an offloading request will be generated,
and the cellular or Wi-Fi network interface on the mobile device will be activated, incurring
a small ramp-up energy cost, such as the WiFi association overhead. After the completion of
transmitting each request, the interface will not immediately switch to the low-power state.
Instead, it remains at the high-power state for tens of seconds — an inactive period referred to
as the tail time [5], as shown in Fig. 3.1 (a). If there is another request coming in during the
tail time, the inactivity timer will be reset, and the interface will stay at the high-power state
until the end of the transmission, plus another period of the tail time if there are no further
successive requests. The tail time phenomenon is especially critical with the 3G interface,
which consumes nearly 60% of the total energy consumption [5].
Figure 3.1: The benefits of coalesced offloading. (a) Before bundling: the requests of app 1 and app 2 are transmitted independently at times t1, . . . , t7. (b) After bundling: the requests are coalesced into transmissions at t2 (t1'), t3, t5 (t4'), and t7 (t6'), shortening the time spent in the high-power state.
As an important insight that we explore in this thesis, the tail time phenomenon can be
alleviated if we bundle the offloading requests into small batches, and handle them all together.
This reduces the energy consumption, as the wireless network interface on the mobile device
is activated fewer times and a shorter total tail time is incurred.
may not have frequent successive requests for code offloading; we focus on the abundant re-
quest bundling opportunities that exist when we consider the offloading requests from multiple
applications running on the device simultaneously. As the example in Fig. 3.1 (b) shows, three
bundles can be formed when the offloading requests of two applications are considered at the
same time, which leads to a shorter period of time for the network interface to stay in the
high-power state, compared to handling each request independently without any coordination.
Such request bundling from multiple applications is formally referred to as coalesced
offloading in this thesis. It requires all offloading requests to be granted by an OS-level
coalesced offloading framework, possibly with a delay, before application code is actually
offloaded to the cloud.
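To make the energy argument concrete, the following Python sketch compares the high-power occupancy time of uncoordinated versus bundled transmissions under the tail-time model described above. The tail time T and all request times are made-up illustrative values, not measurements from this thesis.

```python
# A minimal sketch of the tail-time model described above; the tail time T and
# the request times are made-up illustrative values, not measurements.

T = 10.0  # assumed tail time, in seconds

def high_power_time(transmissions, T):
    """Total time the interface spends in the high-power state when requests
    are sent at the given times, each transmission assumed instantaneous and
    followed by a tail of length T (cut short if another transmission occurs)."""
    ts = sorted(transmissions)
    total = sum(min(nxt - cur, T) for cur, nxt in zip(ts, ts[1:]))
    return total + T  # full tail after the last transmission

app1 = [0.0, 12.0, 30.0]  # offloading request times of application 1
app2 = [4.0, 18.0, 33.0]  # offloading request times of application 2

# Uncoordinated: every request is transmitted at its own arrival time.
uncoordinated = high_power_time(app1 + app2, T)
# Coalesced: nearby requests from both apps are bundled into three batches,
# each transmitted at the arrival time of its latest request (cf. Fig. 3.1(b)).
bundled = high_power_time([4.0, 18.0, 33.0], T)

print(uncoordinated, bundled)  # 41.0 30.0 — bundling reduces high-power time
```

Under this model, bundling strictly reduces the energy proxy whenever it reduces the number of separate tail periods; the latency cost of withholding requests is what the optimization in Section 3.2.2 trades off against.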
3.2.2 The Coalesced Offloading Problem
While coalesced offloading is able to reduce energy costs, request bundling requires the subse-
quent offloading requests to wait for a period of time for the next batch to be handled, which
results in additional offloading delays and may adversely affect the application performance.
The main challenge of coalesced offloading is balancing the tradeoff between the energy cost
and the application performance. If requests are bundled more aggressively, lower energy costs
are incurred, as a shorter period of time is spent in the high-power state for code offloading.
However, withholding the offloading requests will inevitably cause longer offloading delays.
On the other hand, sending offloading requests in a more scattered manner can maintain the
high performance of applications, but will incur a longer period of time in the high-power state,
causing more energy to be consumed. To find the “sweet spot” in such an inherent tradeoff be-
tween energy savings and application performance, we formulate the problem of coalesced
offloading as a joint optimization problem, considering both the energy cost and application
performance in the objective function.
We assume that there are M applications, 1, 2, . . . , M, running on the mobile device, and
each application generates multiple offloading requests during its runtime based on its own
profiler and solver. Let a1, a2, . . . be the arrival time sequence of the offloading requests
across all the applications, and g1, g2, . . . be the granting time sequence, each element
representing one transmission from the mobile device to the cloud. Notice that multiple
requests can be bundled and granted in one transmission. The granting time directly determines
the transition time from the low-power to the high-power state. The device transitions from
the high-power to the low-power state only when the network has been inactive for the length
of the tail time; that is, the interface enters the low-power state at least one tail time
after the preceding transmission. We use the sequences t1, t2, . . . and s1, s2, . . . to denote
the transition times at which the wireless interface enters the high-power state from the
low-power state, and vice versa, respectively. Let T be the duration of the tail time after
the completion of a transmission. Since the duration of a request transmission is a few orders
of magnitude shorter than the tail time (which is on the order of seconds), we assume that all
request transmissions are completed instantaneously.
Fig. 3.2 shows an illustrative example of our model. The offloading requests arrival time
sequence is a1, a2, . . . , a9, and the granting time sequence is g1, g2, . . . , g5. As we can see, two
offloading requests generated at time a1 and a2 are delayed to be transmitted at g1, the arrival
time of the third request a3, and the network interface goes into the high-power state. Since
the high-power state remains for at least T , requests generated at a4 and a5 are transmitted
immediately. The network interface transitions to the low-power state after idling for time T,
and enters the high-power state again at the next transmission time g4. In a nutshell, we seek
to find the optimal granting time sequence g1, g2, . . ., which determines when the wireless
interface of the mobile device should stay in the high-power state to transmit offloading
requests, such that a combined interest in both the energy cost and the application
performance is optimized.
Figure 3.2: The coalesced offloading problem: an illustrative example. (A power-state timeline with tail time T, showing arrivals a1, . . . , a9 and grants g1, . . . , g5, where a3(g1) = t1 and a8(g4) = t2, and s1 marks the transition back to the low-power state.)

Since the actual energy cost is nearly linear in the duration that the network interface stays
in the high-power state, in our problem formulation we use that time duration to represent the
energy cost.
Observation 3.1 If a transmission occurs at gi when the network interface is in the high-power
state, the energy cost is gi − gi−1. If gi occurs when the network interface is in the low-power
state, the energy cost can be considered as T .
To be more specific, when transmission gi occurs while the interface is in the high-power
state, it extends that state for a period of gi − gi−1. If gi occurs during the low-power
state, it contributes the tail time T to the energy cost. If gi − gi−1 > T, the high-power
state expires before gi, so gi occurs in the low-power state. Thus, the energy cost for one
transmission is min{gi − gi−1, T}. The joint optimization problem of coalesced offloading can
be formulated as follows:
\[
\min\; f_{\mathrm{cost}} \;=\; \sum_{j} \min\{g_j - g_{j-1},\, T\} \;+\; \alpha \sum_{j} \sum_{i:\; g_{j-1} \le a_i \le g_j} (g_j - a_i). \tag{3.1}
\]
In the objective function, the first term represents the energy cost while the second term denotes
the total latencies as offloading requests are postponed by the coalesced offloading framework.
α is introduced to combine the two objectives, and to balance the conflicting interests between
minimizing the energy costs and minimizing the total latencies for granting the offloading
requests.
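The objective can be evaluated directly. The sketch below assumes the first transmission always pays a full tail T (i.e., there is no earlier g0) and that each request is served by the earliest transmission at or after its arrival; the arrival times, grant times, T, and α are illustrative values, not from the thesis.

```python
# A sketch of objective (3.1). We assume the first transmission pays a full
# tail T, and each request is served by the earliest grant at or after its
# arrival; the arrival/grant times, T, and alpha are illustrative values.

def fcost(arrivals, grants, T, alpha):
    g = sorted(grants)
    # Energy term: each transmission extends the high-power state by at most T.
    energy = T + sum(min(g[j] - g[j - 1], T) for j in range(1, len(g)))
    # Latency term: each request waits from its arrival until it is granted.
    latency = sum(min(x for x in g if x >= a) - a for a in arrivals)
    return energy + alpha * latency

arrivals = [1.0, 2.0, 5.0, 9.0]
print(fcost(arrivals, grants=[5.0, 9.0], T=10.0, alpha=0.5))  # 17.5 (bundled)
print(fcost(arrivals, grants=arrivals, T=10.0, alpha=0.5))    # 18.0 (immediate)
```

With these values, bundling the first three requests at g = 5 beats granting every request at its own arrival; a larger α penalizes delay more heavily and eventually favors immediate granting.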
At first glance, our formulated problem is similar to the dynamic TCP acknowledgment
problem [9]. The dynamic TCP acknowledgment problem discusses the scenario when a num-
ber of subsequent messages are to be acknowledged, whether we should acknowledge each
individual message immediately upon receiving it, or acknowledge multiple messages with a
single acknowledgment packet. While we amortize the tail energy by delaying the offloading
requests, the dynamic TCP acknowledgment problem delays the acknowledgments to alleviate
the acknowledgment overhead. That said, the two problems are actually quite different. In our
problem, the tail energy is determined by the amount of time that the mobile device stays in
the high-power state, since the transmission time is negligible compared to the tail time; in
the dynamic TCP acknowledgment problem, the acknowledgment overhead mainly depends on the
number of acknowledgments.
3.3 Coalesced Offloading: an Offline Solution
In this section, we solve the optimization problem formulated above by first transforming it
into a discrete-time optimization problem, and then presenting an offline solution based on
dynamic programming.
3.3.1 From Continuous-Time to Discrete-Time Formulation
By carefully analyzing problem (3.1), we make the following key observation.

Observation 3.2 To minimize fcost, all offloading requests should be transmitted either at their respective arrival times or at the arrival times of other requests.
Proof We prove that any transmission schedule that does not satisfy the above condition increases the total cost. Suppose two requests arrive sequentially at times 〈a1, a2〉, and we are about to schedule their transmission time g with the objective of minimizing fcost. We have three choices: (1) g ∈ (0, a1], (2) g ∈ (a1, a2), (3) g ∈ [a2, ∞).
For the first case, the total cost is

fcost1 = (a1 − g) + min{a2 − a1, T} + T.

Obviously, when g = a1, the value is minimized:

fmin_cost1 = min{a2 − a1, T} + T.
In the second case, there is an α(t − a1) latency cost for the first request. Adding the same energy cost as in the first case, the total cost takes the form

fcost2 = min{a2 − t, T} + T + α(t − a1),  a1 < t < a2.

Similarly, the cost in the third case is

fcost3 = T + α(a2 − a1) + 2α(g − a2),

whose minimum value,

fmin_cost3 = T + α(a2 − a1),

is attained at g = a2.
From fcost2, we have:

• When a2 − t < T,

fcost2 = α(a2 − a1) + (1 − α)(a2 − t) + T.

Obviously, if α < 1, fcost2 > fmin_cost3. If α > 1, then

fcost2 > (t − a1) + (a2 − t) + T = a2 − a1 + T ≥ fmin_cost1.

• When a2 − t > T,

fcost2 = α(t − a1) + 2T > 2T ≥ fmin_cost1.
Therefore, no offloading request should be granted at times other than the arrival times if the total cost fcost is to be minimized. All requests are granted either at their own arrival times or at the arrival times of other requests.
Since tj indicates the time when the wireless interface enters the high-power state, after which requests are granted for transmission as they arrive, tj must equal the arrival time of one of the requests. Similarly, sj falls exactly one tail time T after the arrival time of some granted request. The original problem is thus equivalent to determining at which request's arrival the interface should be switched to the high-power state, and from which request's arrival no further transmissions should take place. For each request, the scheduling decision becomes whether to transmit it immediately or to wait until the next transmission. In this way, we transform the original problem of deciding when to power the network interface on and off into making a decision for each request upon its arrival: whether to send it out immediately or to delay it. We use 1 to represent transmitting the current request (with or without previously delayed requests), and 0 to represent the decision to delay the current request. The original problem (3.1) thus becomes that of deciding a binary transmission sequence 〈1, 0, 0, . . . , 1, . . .〉 for the successively arriving requests, such that the total cost fcost is minimized.
In a nutshell, if a request is granted immediately, its latency cost is 0, and the energy cost of transmitting the request arriving at ai is min{ai − gprev, T}, where gprev represents the preceding transmission time; if the request is delayed, it incurs only a latency cost, since it does not extend the tail time. Whenever a request is withheld from immediate transmission, its latency cost is α(gnext − ai), where gnext is the next transmission time. Thus, we have

f^i_cost = min{ai − gprev, T}, if granted;
           α(gnext − ai), if delayed.    (3.2)

Let fcost represent the sum of the energy and latency costs of transmitting the entire request sequence. We should minimize

fcost = Σ(i = 1 to n) f^i_cost,    (3.3)

over the 2^n possible binary transmission sequences, according to Eqn. (3.2).
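As a sketch, the per-request costs of Eqn. (3.2) can be summed directly for any candidate binary sequence; the function and variable names below are ours for illustration, and we assume the last request is granted:

```python
def schedule_cost(arrivals, seq, T, alpha):
    """Evaluate f_cost (Eqn. (3.3)) for a binary transmission sequence.

    A granted request (seq[i] == 1) pays the energy cost
    min{a_i - g_prev, T}, or a full tail T for the very first grant;
    a delayed request (seq[i] == 0) pays the latency cost
    alpha * (g_next - a_i). The last request is assumed granted.
    """
    grants = [a for a, s in zip(arrivals, seq) if s]
    cost, g_prev = 0.0, None
    for a, s in zip(arrivals, seq):
        if s:
            cost += T if g_prev is None else min(a - g_prev, T)
            g_prev = a
        else:
            # A delayed request waits for the next grant at or after it.
            g_next = next(g for g in grants if g >= a)
            cost += alpha * (g_next - a)
    return cost
```

For example, with a tail time of 9 and α = 0.3, granting both of two requests arriving at times 0 and 2 costs 9 + 2, while delaying the first one costs 0.3 · 2 + 9.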
The transformation from the resulting binary transmission sequence back into the time sets 〈t1, t2, . . . , tk〉 and 〈s1, s2, . . . , sk〉 is simple: let t1 be the arrival time of the first 1 appearing in the sequence. Whenever the interface enters the high-power state, a timer is set to T. If a request is granted before the timer counts down to 0, the timer is reset to T. s1 is the first time the timer counts down to 0. Whenever the binary transmission sequence turns to 1 again, we set the arrival time of that request as t2. In this way, we alternately determine the time sequence t1, t2, . . . of entering the high-power state and the time sequence s1, s2, . . . of leaving it.
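The timer procedure above can be sketched as follows (function and variable names are ours, for illustration):

```python
def to_power_intervals(arrivals, seq, T):
    """Recover the on/off time sets <t1,...,tk> and <s1,...,sk> from a
    binary transmission sequence: the timer is reset to T on every grant,
    and the interface leaves the high-power state when it reaches 0."""
    t, s = [], []
    expiry = None  # time the current tail would end; None = interface off
    for a, granted in zip(arrivals, seq):
        if not granted:
            continue  # delayed requests do not touch the timer
        if expiry is None or a > expiry:
            if expiry is not None:
                s.append(expiry)  # previous high-power period ended
            t.append(a)           # interface enters the high-power state
        expiry = a + T            # timer reset to T on each grant
    if expiry is not None:
        s.append(expiry)
    return t, s
```

With T = 9, granting requests arriving at 0, 2, and 20 yields on-times 〈0, 20〉 and off-times 〈11, 29〉: the second grant rides on the first tail, while the third opens a new high-power period.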
3.3.2 Optimal Offline Algorithm
We now present an optimal offline algorithm to solve problem (3.1), in which the arrival time sequence a1, a2, . . . , an is given a priori. The objective is to output a binary transmission sequence Seq[n] such that the total cost fcost is minimized. Though it depends on the unrealistic assumption of knowing the timing of all future requests, our offline algorithm serves as the benchmark for designing and evaluating our online algorithms.
We use dynamic programming to obtain an optimal offline algorithm with a time complexity of O(n^2) (the inner latency sums can be maintained incrementally). Let Cmin[i] be the minimum cost of the arrival time subsequence 〈a1, a2, . . . , ai〉 and Seq[i] be the corresponding binary transmission sequence. For an arrival time sequence of length i, there are 2^i possible binary transmission sequences; rather than enumerating them, the dynamic program implicitly considers all of them to obtain the one with the minimum cost.
With respect to the binary transmission sequence, we state the following facts that lead to the offline algorithm. If the last request must be transmitted, there are 2^(i−1) possible binary transmission sequences in total for the arrival time sequence 〈a1, a2, . . . , ai〉. If the granting time sequence of 〈a1, a2, . . . , ai−1〉 is a prefix of that of 〈a1, a2, . . . , ai〉, the cost of 〈a1, a2, . . . , ai〉 is the cost of 〈a1, a2, . . . , ai−1〉 plus min{ai − ai−1, T}. Otherwise, the last j − 1 requests before ai are delayed and granted together with ai, and the cost of 〈a1, a2, . . . , ai〉 is the sum of the cost of 〈a1, a2, . . . , ai−j〉, the latency costs of the requests from ai−j+1 to ai−1, and the tail time T. To obtain the granting time sequence minimizing the total cost, we therefore first find the minimum-cost granting time sequences of all prefixes of 〈a1, a2, . . . , ai〉. Our optimal offline algorithm is summarized in Algorithm 3.1.
Algorithm 3.1 The Offline Algorithm
Input: a1, a2, . . . , an
Output: Seq[n]
Initialize Cmin[0] = 0, Seq[0] = 〈〉
Initialize Cmin[1] = T, Seq[1] = 〈1〉
for i ∈ [2, n] do
    Cmin[i] = Cmin[i − 1] + min{ai − ai−1, T}
    Seq[i] = 〈Seq[i − 1], 1〉
    for j ∈ [2, i] do
        C[i] = Cmin[i − j] + α Σ(k = i−j+1 to i−1) (ai − ak) + T
        if C[i] < Cmin[i] then
            Cmin[i] = C[i]
            Seq[i] = 〈Seq[i − j], 0, . . . , 0 (j − 1 zeros), 1〉
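Algorithm 3.1 can be sketched in Python as follows; this is an illustrative implementation with our own function and variable names, not the thesis's code:

```python
def offline_schedule(arrivals, T, alpha):
    """A sketch of Algorithm 3.1: a dynamic program returning the minimum
    total cost and the corresponding binary transmission sequence, given
    all arrival times a priori."""
    n = len(arrivals)
    if n == 0:
        return 0.0, []
    C = [0.0] * (n + 1)           # C[i]: minimum cost of the first i arrivals
    seq = [[]] + [None] * n
    C[1], seq[1] = float(T), [1]
    for i in range(2, n + 1):
        ai = arrivals[i - 1]
        # Option 1: grant request i within the tail of request i-1.
        C[i] = C[i - 1] + min(ai - arrivals[i - 2], T)
        seq[i] = seq[i - 1] + [1]
        # Option 2: delay requests i-j+1 .. i-1 and grant them all with
        # request i, paying their latency plus a fresh tail T.
        for j in range(2, i + 1):
            latency = alpha * sum(ai - arrivals[k] for k in range(i - j, i - 1))
            c = C[i - j] + latency + T
            if c < C[i]:
                C[i], seq[i] = c, seq[i - j] + [0] * (j - 1) + [1]
    return C[n], seq[n]
```

With T = 9 and α = 0.3, two requests at times 0 and 2 are best served by delaying the first (cost 0.3 · 2 + 9 = 9.6 versus 9 + 2 = 11 for granting both).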
Theorem 3.1 The offline algorithm produces an optimal solution to the transformed coalesced offloading problem (3.3).

Proof It is known that if a problem possesses the optimal substructure property, then any dynamic programming algorithm that explores all subproblems is an optimal algorithm. To see that problem (3.3) possesses the optimal substructure property, we only need to note that if the sequence Seq[i] optimizes the total cost of input 〈a1, a2, . . . , ai〉, then its prefix Seq[i − j] must be an optimal solution for the subsequence 〈a1, a2, . . . , ai−j〉. Our algorithm explores all possible subproblems, and therefore obtains the optimal solution.
3.4 Ready, Set, Go: Online Algorithms
We are now ready to design Ready, Set, Go (RSG), our online algorithms to solve problem (3.1) without a priori knowledge of the arrival time sequence. We begin by considering algorithms that probabilistically vary the amount of latency, following an approach similar to that used for the dynamic TCP acknowledgment problem [9]. We show that the coalesced offloading problem is a generalization of this problem, which is itself known to be a generalization of the online ski rental problem.
3.4.1 The Dynamic TCP Acknowledgment Problem
The dynamic TCP acknowledgment problem is a generalization of the online ski rental problem, and takes the following form. The input is a sequence of packet arrival times a1, a2, . . . , an, and the output is a set of times t1, t2, . . . , tk at which an acknowledgment occurs. The latency is defined as the amount of time that elapses between a packet's arrival and its acknowledgment. The cost of each acknowledgment is 1. The objective is to minimize

k + Σ(1 ≤ j ≤ k) latency(j).

Karlin et al. proved that the randomized algorithm for this problem has an optimal competitive ratio of e/(e − 1).
We can see from the following analysis that the coalesced offloading problem is a generalization of the dynamic TCP acknowledgment problem. While the cost of each acknowledgment in the dynamic TCP acknowledgment problem is a constant, its counterpart in our problem, i.e., the energy consumption of each transmission, is a function of the previous transmission time. To show that the dynamic TCP acknowledgment problem is a special case of our problem, we only need to set T such that T < (gi − gi−1) for all i. The energy cost of each transmission is then a constant T. If we further set both T and α to 1, the energy cost is exactly the acknowledgment cost of the TCP problem, while the latency costs in the two problems are equivalent.
3.4.2 The Online Algorithm Aθ
Our algorithm Aθ is defined as follows.
Definition 3.1 Aθ is a randomized algorithm that selects θ between 0 and 1 according to the probability density function p(θ) = e^θ/(e − 1). Let R(t, t′) be the number of requests that arrive between times t and t′, and let g1, g2, . . . , gi, . . . be the times at which requests are granted and transmitted. Algorithm Aθ grants the next request at gi+1 such that there exists a time τi+1, gi < τi+1 < gi+1, satisfying

R(gi, τi+1)(gi+1 − τi+1) = (θ/α)Si,    (3.4)

where

Si = 2(min{τi+1 − gi, T} + min{gi+1 − τi+1, T}) − min{gi+1 − gi, T}.
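Since the CDF of this density, P(x) = (e^x − 1)/(e − 1), inverts in closed form, θ can be drawn by inverse-transform sampling; a sketch (function names are ours):

```python
import math
import random

def sample_theta(rng=random):
    """Draw theta from p(theta) = e^theta / (e - 1) on [0, 1] by
    inverse-transform sampling: P(x) = (e^x - 1)/(e - 1), so
    P^{-1}(u) = ln(1 + u * (e - 1)) for uniform u in [0, 1)."""
    u = rng.random()
    return math.log(1.0 + u * (math.e - 1.0))
```

The inverse maps u = 0 to θ = 0 and u = 1 to θ = 1, so every draw lands in [0, 1) as required.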
The intuition behind the equation above is simple: given the previous transmission occurring at time gi, an additional transmission happening at τi+1 would reduce the latency cost by θSi, while Si is essentially the increase in energy cost due to the additional transmission. It is easy to prove that

min{τi+1 − gi, T} + min{gi+1 − τi+1, T} ≥ min{gi+1 − gi, T},

so that

Si ≥ min{gi+1 − gi, T} ≥ min{gi+1 − τi+1, T}.    (3.5)
Fig. 3.3 helps to explain our algorithms and proofs. The x-axis represents time, and the y-axis represents the number of request arrivals. The staircase function indicates the arrival sequence of the requests, and the gi are the times at which bundles of requests are granted. The shaded area between the staircase curve and the dotted line represents the saved latency cost. Fig. 3.3 shows an example of the online algorithm A1, letting τ0 = g0 = 0.
Figure 3.3: The online algorithm A1.
3.4.3 Deterministic Online Algorithm: Performance Analysis
We prove that when θ = 1, the deterministic online algorithm A1 is 2-competitive against the
optimal algorithm AOPT.
Lemma 3.1 The optimal algorithm grants a request between any pair of successive transmis-
sions.
Proof Suppose that A1 grants requests at times g1, g2, . . . , gi, . . .. Enrich the sequence by adding a transmission at time τi+1, gi < τi+1 < gi+1, for every i, such that Eqn. (3.4) is satisfied. It is easy to see from Fig. 3.3 that, by adding a transmission at time τi+1, the latency cost decreases by at least min{gi+1 − τi+1, T} units between gi and gi+1, whereas the additional energy consumption incurred is at most min{gi+1 − τi+1, T} units. In this case, the new sequence is at least as good as the original one. The bound on the latency reduction follows from Eqn. (3.5). To see that the additional energy cost is at most min{gi+1 − τi+1, T}, recall that the additional transmission at τi+1 adds an energy cost of min{τi+1 − gi, T}, while the original energy cost of the transmission at gi+1, min{gi+1 − gi, T}, is updated to min{gi+1 − τi+1, T}. The net increase of the energy cost satisfies

min{τi+1 − gi, T} − min{gi+1 − gi, T} + min{gi+1 − τi+1, T} ≤ min{gi+1 − τi+1, T},

since τi+1 < gi+1. Hence, there exists an optimal sequence that grants requests at least once in the interval (gi, gi+1) for every i.
Theorem 3.2 Algorithm A1 is 2-competitive.
Proof Let the input be an arbitrary request arrival sequence. As Lemma 3.1 has shown, the optimal algorithm grants at least one request between any pair of successive transmissions. In Fig. 3.4, we use a ray instead of a staircase curve to simplify the presentation of the arrival times of the requests. The solid and dotted lines stand for the transmission sequences of the optimal algorithm AOPT and of A1, respectively. The shaded area below AOPT and above A1 is L(A1\AOPT), the latency cost incurred by A1 but not by AOPT. This area is bounded by the granting times of the optimal sequence on the left and the granting times of A1 on the right. Thus, by the definition of A1, we have

L(A1\AOPT) ≤ (1/α) Σi Si.

The total cost of A1 can then be calculated as follows:

CA1 = αL(A1) + E(A1)
    ≤ CAOPT + αL(A1\AOPT) − E(AOPT) + E(A1)
    ≤ CAOPT + α(1/α) Σi Si − Σi (min{gi+1 − τi+1, T} + min{τi+1 − gi, T}) + Σi min{gi+1 − gi, T}
    = CAOPT + Σi (min{gi+1 − τi+1, T} + min{τi+1 − gi, T})
    = CAOPT + E(AOPT)
    ≤ 2COPT.
3.4.4 Performance Analysis of Aθ
Theorem 3.3 The competitive ratio between the expected cost incurred by Aθ and the optimal
cost is e/(e− 1).
Proof We start by decomposing the total cost of Aθ. As illustrated in Fig. 3.4, L(Aθ\AOPT) is the latency incurred by Aθ but not by AOPT, shown as the dark shaded area above the dotted line and below the solid line. Likewise, L(AOPT\Aθ) stands for the latency incurred by AOPT but not by Aθ, shown as the light shaded area above the solid line and below the dotted line. The latency cost of Aθ is the area above the curve of Aθ and below the curve of request arrivals, which is at most the area above the solid curve, plus the dark shaded area, minus the light shaded area. Thus, the total cost satisfies

CAθ ≤ Eθ + (COPT − EOPT) + α[L(Aθ\AOPT) − L(AOPT\Aθ)],

letting Eθ and EOPT be the energy costs of Aθ and AOPT, respectively.

Figure 3.4: The proof of the competitive ratio of Aθ.

By the definition of Aθ, the dark shaded area satisfies

L(Aθ\AOPT) ≤ (θ/α) Σi Si = (θ/α)(2EOPT − Eθ).
For the light shaded area, we first prove the following lemma.

Lemma 3.2 The light shaded area L(AOPT\Aθ) satisfies:

αL(AOPT\Aθ) ≥ ∫_θ^1 E(x)dx − (1 − 2θ)EOPT − θEθ.    (3.6)

To prove the lemma, we make the following claim. Let M(E, θ) be the minimum, over all possible granting sequences W with energy cost E, of the area above W and below the Aθ curve, as shown in Fig. 3.5, where we omit the request arrivals and use a line to represent Aθ. We claim that, for any u > v ≥ θ,

M(Eu, θ) ≥ [(v − θ)/α](Ev − Eu) + M(Ev, θ).    (3.7)
Proof Let nu and nv represent the total numbers of grants made by algorithms Au and Av on the same input. The granting sequence is h1, h2, . . . , hnu for Au and g1, g2, . . . , gnv for Av. As shown in Fig. 3.5, the shaded rectangles of Av, constructed according to the definition of Av, intersect the Au curve at most nu times. Therefore, at least nv − nu shaded rectangles lie strictly above the curve of Au. Pick exactly nv − nu of them, denote each by its transmission sequence number i, and define the set of these rectangles as V∗. Let

S(V∗) = Σ(i ∈ V∗) Si.

The sum of the areas of the nv − nu rectangles in V∗ is then (v/α)S(V∗), and the part of this area that lies above the curve of Aθ is at most (θ/α)S(V∗). Thus, the shaded area below the Aθ curve is at least [(v − θ)/α]S(V∗), and this area lies strictly above the curve of Au. We next generate a new granting sequence g∗1, g∗2, . . . , g∗n with A∗v whose energy cost is exactly the same as that of the transmission sequence of Av; the newly generated sequence also issues a grant at τi for every i ∈ V∗. Thus, the shaded area that lies strictly below the curve of A∗v but above the curve of Au in Fig. 3.5 gives

M(Eu, θ) − M(Ev, θ) ≥ [(v − θ)/α]S(V∗).    (3.8)

Note that Ev in Eqn. (3.8) is the energy cost of the new granting sequence of A∗v, which equals the energy cost of Av. To obtain Eqn. (3.7), it remains to prove that

S(V∗) ≥ Ev − Eu.    (3.9)

Figure 3.5: The proof of Lemma 3.2 (to prove the competitive ratio of Aθ).
By Eqn. (3.5), we have

S(V∗) ≥ Σ(i ∈ V∗) min{gi+1 − gi, T}.

For the same input sequence, the total time spanned by the transmission sequences of Au and Av is the same:

Σ(i = 0 to nu−1) (hi+1 − hi) = Σ(i = 0 to nv−1) (gi+1 − gi).

Removing the nv − nu items in V∗ from the right-hand side and applying the minimum function to both sides, we get

Σ(i = 0 to nu−1) min{hi+1 − hi, T} ≥ Σ(j ∉ V∗) min{gj+1 − gj, T}.
Therefore,

S(V∗) ≥ Σ(i = 0 to nv−1) min{gi+1 − gi, T} − Σ(j ∉ V∗) min{gj+1 − gj, T}
      ≥ Σ(i = 0 to nv−1) min{gi+1 − gi, T} − Σ(i = 0 to nu−1) min{hi+1 − hi, T}
      = Ev − Eu.

Combining with Eqn. (3.8) gives us Eqn. (3.7).
Letting u = v + dv, Eqn. (3.7) can be rewritten as

M(Ev+dv, θ) ≥ [(v − 2θ)/α](Ev − Ev+dv) + M(Ev, θ).

Integrating from θ to t, for any θ < t ≤ 1, we have

∫_θ^t dM(Ev, θ) ≥ −∫_θ^t [(v − θ)/α] dEv + (θ/α) ∫_θ^t dEv,

which is equivalent to

M(Et, θ) − M(Eθ, θ) ≥ (1/α)[∫_θ^t Ev dv − (t − θ)Et] + (θ/α)(Et − Eθ).

By definition, M(Eθ, θ) = 0. Since Ev ≤ Et for v > t, we have

M(Et, θ) ≥ (1/α)[∫_θ^1 Ev dv − (1 − 2θ)Et − θEθ].

Let Et = EOPT, and recall that M(EOPT, θ) is a lower bound for L(AOPT\Aθ). Thus, Lemma 3.2 is proved.
We can now prove Theorem 3.3. By the definition of Aθ, θ is picked from [0, 1] according to the probability density function p(θ) = e^θ/(e − 1), with CDF P(x) = ∫_0^x p(θ)dθ = (e^x − 1)/(e − 1). Then

CAθ ≤ COPT − EOPT + ∫_0^1 p(θ)[Eθ + α(L(Aθ\AOPT) − L(AOPT\Aθ))]dθ
    ≤ COPT − EOPT + ∫_0^1 p(θ)[Eθ + 2θEOPT − θEθ − ∫_θ^1 Ex dx + (1 − 2θ)EOPT + θEθ]dθ
    = COPT − EOPT + ∫_0^1 p(θ)[Eθ − ∫_θ^1 Ex dx + EOPT]dθ
    = COPT + ∫_0^1 p(θ)Eθ dθ − ∫_0^1 p(θ) ∫_θ^1 Ex dx dθ
    = COPT + ∫_0^1 p(θ)Eθ dθ − ∫_0^1 Ex P(x) dx    (by changing the order of integration)
    = COPT + ∫_0^1 (p(θ) − P(θ))Eθ dθ.

At last, since p(θ) − P(θ) = 1/(e − 1) is a constant, we have

CAθ/COPT ≤ 1 + [∫_0^1 (p(θ) − P(θ))Eθ dθ] / [∫_0^1 Eθ dθ] = 1 + 1/(e − 1) = e/(e − 1).
3.5 Performance Evaluation
We evaluate both our offline and online algorithms using model-driven simulations and real-world experiments on a mobile device. We start by measuring the tail time used in our model, and evaluate the cost performance of both the offline and online algorithms. We then quantify the reduction in energy consumption achieved by the RSG algorithms using real-world runtime traces from mobile applications.
We run all of our real-world experiments on an iPhone 3GS with iOS 6.1.3, using the Bell Mobility 3G cellular network. To measure the energy consumption, we use PowerGremlin [20], a power usage monitor application, to record the run-time battery capacity (mAh) at a sample interval of one second. All of our measurements are performed under stable network conditions, with the mobile device running in a standalone environment in which all applications and background tasks other than our application-level prototype service are shut off, and with the screen off.
3.5.1 Measuring the Tail Time
Our methodology for measuring the tail time is as follows. We initially planned to use PowerGremlin to track the energy trace of sending a single packet; however, the resulting energy change turned out to be too subtle to detect. We therefore measure the tail time by transmitting successive packets of equal sizes at fixed time intervals. Our argument is that, when the interval is smaller than the tail time T, the 3G network interface is kept on from one transmission to the next, so varying the transmission interval makes no difference to the overall energy consumption. On the contrary, if the interval between transmissions is longer than the tail time T, the 3G network interface enters standby a time T after the completion of the last transmission, so the overall energy consumption is reduced.
To measure the 3G tail time, we generate stable sequential offloading requests over a period of 5 minutes. To eliminate the effect of varying transmission costs incurred by different packet sizes, we set the packets to be of equal sizes, and small enough to avoid a heavy transmission overhead. In our experiments, the time intervals between requests span from 3 to 17 seconds. Our measurement results are in accordance with our argument: the total energy consumed during the 5-minute period remains at the same level as the transmission interval varies from 3 to 9 seconds, but drops dramatically beyond 9 seconds. As a result, we take 9 seconds as the tail time of the 3G interface in iOS 6, and use this value in our subsequent simulations and experiments.
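The decision rule we applied to these measurements can be sketched as a simple threshold test; the data and the drop_ratio threshold below are illustrative, not our actual measurements:

```python
def estimate_tail_time(energy_by_interval, drop_ratio=0.9):
    """Pick the tail time from measurements of total energy per fixed
    transmission interval: the smallest interval whose total energy falls
    clearly below the plateau seen at the shortest intervals.
    (An illustrative heuristic; drop_ratio is an assumed threshold.)"""
    intervals = sorted(energy_by_interval)
    plateau = energy_by_interval[intervals[0]]
    for iv in intervals:
        if energy_by_interval[iv] < drop_ratio * plateau:
            return iv
    return None
```

If no interval shows a clear drop, the sweep did not reach the tail time and the function returns None, signaling that longer intervals should be tested.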
3.5.2 Model-Driven Evaluation
In our model-driven evaluations, the trace of offloading requests is a sequence of arrival times 〈a1, a2, . . . , ai, . . .〉, simulating the timing of multiple offloading requests from several simultaneously running applications. We categorize the request patterns into three types: low, medium, and high fluctuation. With these request sequences as input, we compare the total cost fcost of the online RSG algorithms against the benchmark of the offline algorithm. Each simulation result is averaged over 500 rounds of tests.
As Fig. 3.6-(a) shows, on average, the cost of the randomized online RSG algorithm is no more than 1.4009 times that of the benchmark offline algorithm, while the ratio of the cost of the deterministic online RSG algorithm to that of the offline algorithm is 1.4652. Both numbers are within the e/(e − 1) ≈ 1.58 and 2 competitive ratios analyzed previously. In addition, the randomized online algorithm generally achieves a better fcost when the fluctuation is higher, while the performance of the deterministic online algorithm remains almost the same across inputs.
Fig. 3.6-(b) compares the energy costs of the naive, deterministic online, randomized online, and offline algorithms as the weight factor α varies. The naive strategy is to send each offloading request upon its arrival. As stated previously, the energy cost is proportional to the time that the network interface stays in the high-power state, which we use to estimate the energy costs. As Fig. 3.6-(b) illustrates, the naive strategy incurs the highest energy cost for all values of α. The offline algorithm performs best when α is less than 1, but approaches the curve of the naive case as α increases. This is consistent with our observation that the offline algorithm grants more requests immediately upon arrival when more weight is placed on the latency. The curves of the two RSG online algorithms lie between the naive and offline ones, and their energy costs climb less dramatically than the offline curve.
Figure 3.6: (a) The fcost of offloading requests with different levels of fluctuations. (b) The estimated energy cost with varying α.
3.5.3 Experiments on the Mobile Phone
In our real-world experiments, we choose XML-RPC [21] to emulate successive offloading requests generated by multiple applications and their transfers to the cloud. We use three typical types of offloading requests (random, bursty, and stable) to represent real-world traces. Each measurement result is averaged over 50 trials, with each trial containing around 50 transfer requests. The parameter α is set to 0.3 unless otherwise noted. From our experimental results, as shown in Fig. 3.7-(c), we observe that the RSG deterministic online, randomized online, and offline algorithms achieve average energy reductions of 20.23%, 27.10%, and 60.20%, respectively, across all three types of requests, compared to the naive strategy.

Table 3.1: Energy cost reduction compared with the naive strategy.

α      Offline    Randomized    Deterministic
0.3    62.3%      28.2%         14.84%
1.0    36.23%     14.33%        8.49%
1.3    34.09%     13.86%        8.61%
Figure 3.7: (c) The unit-time energy cost (mAh) for the three types of requests. (d) The unit-time energy cost (mAh) on the mobile device with varying α.
Looking into how the energy costs vary with α, we find that when α is smaller, RSG bundles requests more aggressively, so that more energy is saved. Fig. 3.7-(d) and Table 3.1 together illustrate the energy cost reductions of the different algorithms compared to the naive strategy with varying α, using random requests only.
We take a step further and test our algorithms using real-world traces. To collect the traces, we run three typical mobile applications (Rubik Solver, Email, and online chatting) that are ready to offload on our iPhone 3GS, and leverage Wireshark [22] to record their network traffic. The Rubik Solver exhibits highly bursty traffic because it is computation-intensive, whereas Email regularly checks with the server in the background, and online chatting generates traffic arbitrarily from time to time. Fig. 3.8-(e) shows the actual transmission times before and after scheduling. Apparently, with the RSG algorithm, requests from multiple applications are transmitted in bundles. To further verify our results, we monitor the raw battery voltage variation on the mobile device. As Fig. 3.8-(f) shows, the battery voltage is more stable and decreases more moderately with RSG. Our experiments reveal that by performing the RSG algorithm on our real-world traces, the energy consumption is reduced by 20.71%.
Figure 3.8: (e) The request transmissions on the mobile device with and without the RSG algorithm. (f) The battery voltage change as measured on the mobile device. The top figure shows the result of the naive strategy, and the bottom one shows the result of the RSG algorithm.
3.6 Summary
Coordinating the offloading requests of multiple applications to achieve greater energy savings
while maintaining satisfactory performance is an important issue in offloading from mobile
devices to the cloud. In particular, how can we schedule the offloading requests without any
knowledge of the future requests? To answer that, we propose RSG, which consists of two
online algorithms, one deterministic and one randomized, that dynamically decide when to
grant requests without future information. We prove that the RSG online algorithm achieves
the best possible 2-competitive ratio for the deterministic case and e/(e−1) for the randomized
one. With RSG, our real-world implementation on the iOS platform has shown a substantial
amount of energy savings.
Chapter 4
Coalition Formation for Collaborative
Mobile Computing
4.1 Overview
In this chapter, we study the coalition formation problem for a group of mobile users working on one job. Several cases motivate such collaborations. A mobile device constrained by its own battery may need to seek help from other devices: a typical example is a device that would drain its battery before finishing the job, and thus looks for collaborators to share the job with and bring down its energy cost. Alternatively, a device may be limited by its capability or physical location; for example, a crowdsourcing task may require data gathered from the sensors or cameras of smartphones in a certain area. The collaborative computing model also fits many other applications in crowdsourcing, content sharing, indoor localization, and so on. A particular scenario arises when users enter a venue: they are likely to use the same shopping-aid or navigation application while in close proximity to one another, and crowdsourcing to map an indoor floor plan is such a job that can be shared among a group of proximate users. Since collaboration with other devices can potentially mitigate the overall costs, each mobile device would naturally seek to form coalitions to share its computation and network resources. The underlying principle is similar to that of mobile code offloading to the cloud: each mobile user trades relatively low communication energy expense for high computation cost, so that the overall energy cost is reduced. We leave the security and privacy problems to other literature, and focus on the collaboration issues.
Some general questions arise in the above scenarios: among a set of users who are interested in collaboratively performing a job, how should the tasks be distributed so that the overall energy cost is minimized? How do coalitions form among users hosting different resources? In this work, the terms user and device are used interchangeably. We assume that each user is concerned only with the energy consumed by computation and network connections, as these are the major energy cost factors for a smartphone, and they are pertinent to task distribution. Executing the same task may incur different energy costs across devices. Choosing which collaborators to work with also has a significant impact on the energy cost, since mobile devices communicate with each other via various wireless channels with different energy characteristics. As pointed out in [23], the energy consumed in transmitting a certain amount of data is inversely proportional to the available bandwidth. If we describe the computation capability and network channels of each mobile device using a resource graph, the first problem can be formulated as an optimization problem over all partitions of the graph. However, a centralized solution that minimizes the energy cost is difficult and impractical to obtain; furthermore, a central arbitrator hardly exists in the real world. To tackle the problem, we make the following contributions:
• We formulate the task distribution problem as a 0-1 integer quadratic programming problem with quadratic constraints, minimizing the overall energy cost of completing a job on a group of mobile devices.

• We propose a distributed algorithm for coalition formation based on merge-and-split rules. Using the proposed algorithm, multiple mobile users can self-organize into disjoint independent coalitions.
The remainder of this chapter is organized as follows. Sec. 4.2 introduces the underlying infrastructure and describes the system model. Sec. 4.3 formulates the problem as a centralized task distribution problem. Sec. 4.4 takes another look at the problem, reconsiders it as a coalitional game among users, and proposes a simple distributed merge-and-split algorithm for forming coalitions. The algorithm's performance is evaluated against non-cooperative, other cooperative, and centralized schemes in the following section. The last section briefly summarizes the chapter.
4.2 The Collaborative Computing Model
In this section, we describe the task distribution problem for a group of mobile users. A job is modelled by its workflow, as shown in Fig. 4.1. Divided by functionality, each block stands for an atomic task that can only be executed on one device. An arrow between blocks represents the data flow from one task to the other: the task an arrow points to requires the output of the preceding task as its input; otherwise, the task cannot be completed. For example, the outputs of "Image Capturing" are the inputs of "Data Backup" and "Features Extraction". Tasks with such I/O relations can be executed on different devices or on the same device; in the former case, the link is considered an external link. Of all the state variables that affect the power consumption of a phone, CPU utilization and network connections are the two most significant attributes, measured by computation cycles and the amount of data transmitted or received, respectively.
We express the above job requirement using a directed graph Gt = (V,E). Each node
i ∈ V is a task associated with computation cycles ci. Each link ei,j ∈ E between nodes i
and j is associated with d_{i,j}, the amount of data transferred from task i to task j. The computational cycles of each task and the amount of data to be transferred between tasks are profiled beforehand.
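As a concrete sketch, the profiled task graph can be encoded with plain dictionaries. The task names and sizes below follow Fig. 4.1; the exact wiring of the figure is only partially recoverable, so the edge set is one plausible reading (the 180 KB link is omitted because its endpoints are unclear), and the data-structure layout is our own illustrative choice:

```python
# Task graph Gt = (V, E): each node i carries computation cycles c_i,
# each directed edge (i, j) carries the data volume d_ij transferred
# from task i to task j.  Cycle counts are in millions, data in KB,
# following Fig. 4.1; the edge wiring is a plausible reading only.
cycles = {                      # c_i, in millions of CPU cycles
    "capture_a": 15, "capture_b": 20,
    "backup": 10, "extract": 50, "match": 100,
}
data = {                        # d_ij, in KB
    ("capture_a", "backup"): 350, ("capture_a", "extract"): 2000,
    ("capture_b", "backup"): 350, ("capture_b", "extract"): 2000,
    ("extract", "match"): 500,
}

def successors(task):
    """Tasks that consume the output of `task`."""
    return [j for (i, j) in data if i == task]
```

Any graph library would do equally well; dictionaries simply keep the profiled c_i and d_{i,j} values explicit.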
[Figure: five task blocks, Image Capturing (15M cycles), Image Capturing (20M cycles), Data Backup (10M cycles), Features Extraction (50M cycles), and Find Match (100M cycles), connected by data links of 350 KB, 2 MB, 350 KB, 2 MB, 180 KB, and 500 KB]
Figure 4.1: The Workflow of an Example Job
Due to disparities in hardware and operating systems, different mobile devices consume different amounts of energy for the same computation cycles. Since these mobile devices are distributed in different locations, they connect with each other in various ways, as shown in Fig. 4.2: they can establish Bluetooth or WiFi ad hoc connections when in proximity, or connect via 3G or LTE networks. The energy costs of different wireless channels vary as well. To describe the computational and network resources of the potentially collaborative mobile users, we use the directed resource graph Gr = (N, L), in which each node n ∈ N stands for a device that takes a_n Joules to execute one unit computational cycle. Each link l_{n,m} ∈ L of the graph represents the communication channel from user n to user m, and is associated with b_{n,m} Joules per KB of data transferred from device n to m. The energy costs with regard to computational cycles and data transfer are profiled on each device a priori.
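The resource graph G_r can be sketched the same way: per-device computation costs a_n and per-link transfer costs b_{n,m}. The device names and numbers below are invented for illustration; in the system they would come from per-device profiling:

```python
# Resource graph Gr = (N, L): each device n has a computation cost a_n
# (energy per unit of computation), each directed link (n, m) a transfer
# cost b_nm (energy per KB).  All values here are invented placeholders.
a = {"phone1": 0.05, "phone2": 0.04, "phone3": 0.06}   # J per M cycles (assumed scale)
b = {
    ("phone1", "phone2"): 0.02, ("phone2", "phone1"): 0.02,   # e.g. WiFi ad hoc
    ("phone2", "phone3"): 0.15, ("phone3", "phone2"): 0.15,   # e.g. cellular
}

def transfer_energy(n, m, kb):
    """Energy (J) to ship `kb` KB from device n to m; None if no link."""
    cost = b.get((n, m))
    return None if cost is None else cost * kb
```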
[Figure: mobile devices interconnected via the Internet, a WiFi AP, and Bluetooth links]
Figure 4.2: The Resource Graph
From a holistic view, the goal is to minimize the overall energy consumption of all users accomplishing the job. We assume that all users involved share the common goal of finishing the job. Although each of them could complete the job individually without incurring any communication cost, they may also collaborate with as many other users as possible, as long as the tasks can be divided up. In the latter case, a user can reduce its computation burden at the cost of increased communication energy consumption. Between these two extremes, there should be a sweet spot that minimizes the overall energy cost. To find it, we are interested in partitioning the resource graph Gr into subgraphs, over which users form coalitions, and then mapping the task graph Gt onto each resource subgraph. The steps of partitioning the resource graph and mapping tasks are coupled in minimizing the total energy consumption.
In the next section, we illustrate that the centralized approach to finding the optimal partition and distribution is highly impractical. We then derive a distributed solution that enables each user to choose its set of collaborators.
4.3 Task Distribution for Mobile Applications
Given that the resources and the job workflow are modelled as graphs, there are multiple ways to partition the users into coalitions, and within each coalition, different ways to assign tasks to each user. Our objective is to find the optimal way to partition the resource graph and map the job workflow graph onto it such that the overall energy consumption is minimized.
Besides energy costs, other objectives can be defined within the same problem structure. For example, the processing time of the job can be the goal of task assignment if the job is delay-sensitive. In our problem, we seek a centralized solution that minimizes the overall energy consumption of executing the job. Let B be the set of all partitions of graph Gr, and T be one user coalition of the partition P ∈ B. Our objective is:
\min_{P \in B} \sum_{T \in P} \min C(T). \quad (4.1)
C(T) is the energy consumption of coalition T, i.e., the sum of the energy expenses of all mobile devices in that coalition. Given one partition of the graph, we map the job onto each subgraph in the way that minimizes the energy consumption of the coalition. Fig. 4.3 shows a toy example: mobile devices n1 to n5 form two coalitions T1 and T2, each of which executes the job, and the job consists of three tasks i1, i2, i3.
[Figure: devices n1–n5 partitioned into coalitions T1 and T2, with the tasks i1, i2, i3 mapped onto the devices of each coalition]
Figure 4.3: An example of mapping jobs to a set of mobile devices
In addition to the notation in Section 4.2, we define the following to describe the energy consumed in executing tasks and in communications.
• Let l_{n,m} = 1 if devices n and m are connected on Gr, and 0 otherwise.
• Let e_{i,j} = 1 if task i is connected to task j on Gt, i.e., the output of task i is the input of task j; e_{i,j} = 0 if tasks i and j are not directly associated.
• Let s_{i,n} = 1 if task i is assigned to device n, and 0 otherwise.
• Let r_{i,n} = 1 if task i can be executed on device n, and 0 otherwise.
For a given partition P and a group of users T, we optimize the energy expense over the coalition under the topology constraints of the resource and job workflow graphs. Energy consumption minimization over the coalition can be formulated as follows.
\min C(T) = \sum_{n \in T} \phi_n(T), \quad (4.2)

where \phi_n(T) represents the energy consumption on device n \in T:

\phi_n(T) = \begin{cases} \infty, & \text{if } |T| > |V|, \\ E_n(T), & \text{otherwise,} \end{cases} \quad (4.3)

E_n(T) = a_n \sum_{i \in V} s_{i,n} c_i + \sum_{\substack{i, j \in V \\ i \neq j}} \sum_{\substack{m \in T \\ m \neq n}} s_{i,n} s_{j,m} \left( e_{i,j} b_{n,m} d_{i,j} + e_{j,i} b_{m,n} d_{j,i} \right). \quad (4.4)
We consider it infeasible to divide an atomic task, and thus set the cost to infinity when there are more than |V| users in a coalition. The first part of (4.4) is linear and represents the computational energy cost, while the second part is quadratic w.r.t. s_{i,n} and denotes the network energy expense on all external links of the device. The constraints are as follows:
\sum_{n \in T} s_{i,n} = 1, \quad \forall i \in V, \quad (4.5)

\sum_{i \in V} s_{i,n} \geq 1, \quad \forall n \in T, \quad (4.6)

s_{i,n} s_{j,m} e_{i,j} \leq l_{n,m}, \quad \forall i \neq j, \ \forall n \neq m, \ n, m \in T, \quad (4.7)

s_{i,n} \leq r_{i,n}, \quad \forall i \in V, \ \forall n \in T, \quad (4.8)

s_{i,n} \in \{0, 1\}, \quad \forall n \in T, \ \forall i \in V. \quad (4.9)
The given parameters above are a_n, c_i, l_{n,m}, b_{n,m}, e_{i,j}, d_{i,j}, and r_{i,n}. Given a coalition T, the optimization problem in (4.2) is to find the optimal s_{i,n} that minimizes the energy consumption of the coalition. Constraint (4.5) ensures that each task i is executed on exactly one device. Constraint (4.6) requires each device to be involved in at least one task. Constraint (4.7) states that devices n and m can collaborate only when the external link between them exists. Constraint (4.8) expresses the availability of the resources needed on a device to perform a certain task. The problem in (4.2) is quadratic and non-convex, and the constraints in (4.7) can be non-convex as well, so standard solvers cannot handle this integer program directly. However, since the solution space is bounded (2^{|V| \cdot |T|} \leq 2^{|V|^2}), a brute-force or heuristic search can be used to find the optimal solution.
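Since the per-coalition solution space is bounded, the formulation (4.2)–(4.9) can be mirrored directly by an exhaustive search. The sketch below is an illustrative reading, not the thesis's implementation: it enumerates all assignments s, filters by constraints (4.6)–(4.8), and charges each external transfer once (a simplification of the per-device accounting in (4.4), which splits the cost between sender and receiver):

```python
from itertools import product

def min_coalition_energy(tasks, cycles, data, devices, a, b, link, can_run):
    """Brute-force solution of the coalition subproblem (4.2)-(4.9).

    tasks: list of task ids (V); cycles[i] = c_i
    data[(i, j)] = d_ij for each edge with e_ij = 1
    devices: list of device ids (the coalition T); a[n] = energy per cycle unit
    b[(n, m)] = energy per KB on link n -> m
    link(n, m) -> bool mirrors l_nm; can_run(i, n) -> bool mirrors r_in
    Returns (best_cost, best_assignment), or (inf, None) if infeasible.
    Exhaustive (|T| ** |V| assignments): meant to mirror the formulation,
    not to scale.
    """
    if len(devices) > len(tasks):                      # barrier case of (4.3)
        return float("inf"), None
    best, best_s = float("inf"), None
    for assign in product(devices, repeat=len(tasks)):
        s = dict(zip(tasks, assign))                   # s[i] == n means s_in = 1, so (4.5) holds
        if set(assign) != set(devices):                # (4.6): every device gets a task
            continue
        if not all(can_run(i, s[i]) for i in tasks):   # (4.8)
            continue
        cost = sum(a[n] * sum(cycles[i] for i in tasks if s[i] == n)
                   for n in devices)                   # computation term of (4.4)
        feasible = True
        for (i, j), d in data.items():                 # communication term of (4.4)
            n, m = s[i], s[j]
            if n == m:
                continue                               # internal link: no transfer cost
            if not link(n, m):                         # (4.7): external link must exist
                feasible = False
                break
            cost += b[(n, m)] * d                      # each transfer charged once here
        if feasible and cost < best:
            best, best_s = cost, s
    return best, best_s
```

On a two-task, two-device toy instance this recovers the cheaper of the two device-disjoint assignments, plus the transfer cost of their connecting link.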
4.4 Coalition Formation among Mobile Users
We first seek a centralized solution that minimizes the energy consumption of all users. In the centralized approach, we assume there is an arbitrator who has the energy profiles of all participating users; it decides the coalition structure and, within each coalition, assigns tasks to each user. However, it is shown in [24] that a problem such as (4.1) is NP-complete, mainly because the number of possible partitions of graph Gr grows exponentially with the number of users. Moreover, finding the optimal way of assigning tasks to each group adds further complexity. In the real world, an arbitrator rarely exists to assist with the decision making. Thus we propose a distributed solution that enables each user to make local decisions to join or split from coalitions depending on its preference.
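The exponential growth can be made concrete: the number of partitions of an N-user set is the Bell number B_N. A short recursive enumerator (an illustrative sketch) shows why traversing all of B quickly becomes infeasible:

```python
def partitions(items):
    """Yield every partition of `items` as a list of blocks (coalitions)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put the first element into each existing block in turn ...
        for k in range(len(smaller)):
            yield smaller[:k] + [[first] + smaller[k]] + smaller[k + 1:]
        # ... or let it start a new singleton coalition.
        yield smaller + [[first]]
```

Five users already yield 52 partitions, ten users 115,975, and twenty users roughly 5.2 × 10^13, before even solving the inner task-assignment problem for each coalition.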
4.4.1 Coalitional Game and Properties
A coalitional game is a game in which groups of players may enforce cooperative behavior; hence the game is a competition between coalitions of players. Within coalitional game theory, coalition formation has been a research topic of continuing interest, and a set of analytical tools has been developed [25].
In the setting of collaborative computing, the proposed problem is modeled as an (N, v) coalitional game, where N is the set of players (users) and v is the utility function, or value, of a coalition, which in our case corresponds to the energy cost C(T) of user coalition T. Notice that if the coalition size is larger than the number of tasks in the job, i.e., |V| in the task graph Gt, the job cannot be divided among the users in the coalition, and the corresponding utility is defined as negative infinity. Otherwise, the utility v(T) should be a decreasing function of the energy cost, and is given by:
v(T) = -C(T) = -\sum_{n \in T} \phi_n(T) \quad (4.10)
We first give the following definition from [26] in order to prove important properties of the collaborative computing game model.
Definition 4.1 A coalitional game (N, v) is said to have a transferable utility if the utility
function v(T ) can be arbitrarily apportioned between the coalition’s players. Otherwise, the
coalitional game has a non-transferable utility and each player will have its own utility within
coalition T .
By this definition, we hereby give the first property of the collaborative computing model.
Property 4.1 The proposed collaborative computing game (N, v) has a non-transferable util-
ity.
Proof According to (4.10), the utility v(T) of coalition T is the negative sum of the energy costs of its users. Since the energy cost per device is fixed once the task assignment is done, the utility cannot be transferred among the users. Because the utility of coalition T cannot be arbitrarily apportioned between the coalition's players, the proposed coalitional game has a non-transferable utility.
Generally, it is assumed that the grand coalition, in which all users participate, maximizes the utilities of all users. However, this property does not hold in our case: on one hand, within one coalition, if more than |V| users participate in performing the job, it is infeasible to further split atomic tasks into subtasks; on the other hand, as the number of participating users increases, the communication energy cost increases as well, which decreases the total utility. Therefore, for the proposed (N, v) coalitional game we have the second property:
Property 4.2 For the collaborative computing game (N, v), the grand coalition of all users
does not always form. Disjoint independent coalitions will form among the mobile users.
Overall, we have a non-transferable (N, v) coalitional game, and our goal is to devise a distributed algorithm for coalition formation.
4.4.2 Coalition Formation Algorithm
First, we give the following definitions, which will be used in the derivation of the coalition formation algorithm.
Definition 4.2 A collection is any family T := {T_1, ..., T_l} of mutually disjoint coalitions. If additionally \bigcup_{j=1}^{l} T_j = N, the collection T is called a partition of N.
Definition 4.3 Assume A and B are partitions of the same set C. A comparison relation ▷ is defined such that A ▷ B means the way A partitions C is preferable to the way B partitions C.
Each comparison relation ▷ is used only to compare partitions of the same set of players; partitions of different sets of players are incomparable w.r.t. ▷. As illustrated in [25], the comparison relations on partitions are induced in a canonic way from the corresponding comparison relations on multisets of reals:

A ▷ B ⟺ v(A) ▷ v(B). \quad (4.11)
The comparison relation ▷ on multisets of reals can be defined via different orders, such as the utilitarian order, Nash order, or lexmin order. In this work we consider both coalitional values and individual values when users choose which coalition to join or split from. Coalitional values are values of the coalition as a group; the utilitarian order compares partitions by the total sum of the utilities of their coalitions. Formally, the utilitarian order is defined as follows:

A ▷ B ⟺ \sum_{T \in A} v(T) > \sum_{T \in B} v(T), \quad (4.12)

where v(T) is as defined in Sec. 4.4.1 and corresponds to the (negative) total energy consumed within coalition T.
From an individual's perspective, each user joins or splits from a coalition depending on its own preference over the energy consumption of its device, while not hurting others' benefit. The lower its energy consumption, the higher the user's preference for the coalition. Hence we consider individual values: for a partition T := {T_1, ..., T_l},

φ(T) := {φ_n(T_i) | T_i ∈ T, n ∈ T_i}. \quad (4.13)

Given two partitions T = {T_1, ..., T_l} and T' = {T'_1, ..., T'_k} of the same set of players N, the comparison relations now compare φ(T) and φ(T'), which are multisets of |N| real numbers,
one number per player. φ(T) can also be viewed as a sequence of user utilities of length |N|. We compare such sequences using the Pareto order:

A ▷ B ⟺ ∀ n, φ_n(A) ≤ φ_n(B) and ∃ m, φ_m(A) < φ_m(B), \quad (4.14)

where A and B are assumed to be of the same length.
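Under the assumption that per-user energies are plain numbers and lower energy is preferred, the two comparison relations can be sketched as simple predicates (the function names are our own):

```python
def utilitarian_better(energies_a, energies_b):
    """Utilitarian (coalitional) order over per-user energies: structure A
    is preferred iff its total energy is strictly lower, i.e. its total
    utility (negative energy) is strictly higher, as in (4.12)."""
    return sum(energies_a) < sum(energies_b)

def pareto_better(energies_a, energies_b):
    """Pareto (individual) order of (4.14): no user is worse off and at
    least one is strictly better off.  Both sequences list the energies
    of the same users in the same order."""
    assert len(energies_a) == len(energies_b)
    return (all(x <= y for x, y in zip(energies_a, energies_b))
            and any(x < y for x, y in zip(energies_a, energies_b)))
```

Note that whenever `pareto_better` holds, `utilitarian_better` holds as well, but not conversely, which is the strictness relation discussed in Sec. 4.5.2.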
With the comparison relations defined in (4.12) and (4.14), we hereby give the following two rules that allow us to transform partitions of the grand coalition:

merge: {T_1, ..., T_k} ∪ P → {\bigcup_{j=1}^{k} T_j} ∪ P, where {\bigcup_{j=1}^{k} T_j} ▷ {T_1, ..., T_k}.
split: {\bigcup_{j=1}^{k} T_j} ∪ P → {T_1, ..., T_k} ∪ P, where {T_1, ..., T_k} ▷ {\bigcup_{j=1}^{k} T_j}.
Using the above rules, multiple coalitions can merge into a larger one if at least one user
strictly reduces its energy usage. Likewise, one coalition can be split into smaller coalitions.
Because the number of different partitions is finite, every iteration of the merge and split rules
terminates.
Theorem 4.1 Given the comparison relation defined in Def. 4.3, every iteration of merge and
split rules terminates.
However, when different sequences of merge and split rules are applied to the initial partition, the outcomes may differ. In the following section, we study under what conditions arbitrary sequences of these two rules are guaranteed to yield the same outcome. Before that, we illustrate our algorithms based on these two simple rules.
4.4.3 Stability Analysis
In this section, we study the conditions guaranteeing the unique outcome of the iterations of
the merge and split rules. First, we introduce the notion of defection function D that assigns to
Algorithm 4.1 Collaborative Computing Game through Merge and Split
Input: initial partition T = {T_1, ..., T_l} = N
Output: final partition T_final
repeat
    T = Merge(T)
    T = Split(T)
until merge and split terminate
T_final = T

Algorithm 4.2 Merge
Input: initial partition T = {T_1, ..., T_k}
Output: intermediate partition F
for i ∈ [1, k] do
    for j ∈ [i + 1, k] do
        if {T_i ∪ T_j} ▷ {T_i, T_j} then
            T_i = T_i ∪ T_j
            remove T_j: for l ∈ [j, k - 1] do T_l = T_{l+1}
            k = k - 1; j = j - 1
F = {T_1, ..., T_k}

Algorithm 4.3 Split
Input: initial partition T = {T_1, ..., T_k}
Output: intermediate partition F
for i ∈ [1, k] do
    randomly choose T_j ⊂ T_i
    if {T_j, T_i \ T_j} ▷ {T_i} then
        T_i = T_i \ T_j
        T_{k+1} = T_j; k = k + 1
F = {T_1, ..., T_k}
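Algorithms 4.1–4.3 can be sketched in a few dozen lines. This is one plausible reading, not the thesis's exact implementation: `value(c)` is assumed to return the per-user energies φ_n(c) of coalition c (e.g. from the task-assignment subproblem), and `better` is a comparison relation over aligned energy lists, e.g. the Pareto order of (4.14):

```python
import random

def merge_split(partition, value, better, seed=0):
    """One plausible reading of Algorithms 4.1-4.3 (not the thesis's code).

    partition: list of frozensets of users
    value(c):  dict user -> phi_n(c), the energy of each user in coalition c
    better(a, b): comparison over equal-length energy lists for the same users
    """
    rng = random.Random(seed)

    def profile(coalitions):
        prof = {}
        for c in coalitions:
            prof.update(value(c))
        return prof

    def prefers(new, old):
        # Both sides cover the same user set; align energies per user.
        pa, pb = profile(new), profile(old)
        users = sorted(pa)
        return better([pa[u] for u in users], [pb[u] for u in users])

    changed = True
    while changed:
        changed = False
        # Merge rule: fuse the first pair whose union is preferred.
        for i in range(len(partition)):
            for j in range(i + 1, len(partition)):
                ci, cj = partition[i], partition[j]
                if prefers([ci | cj], [ci, cj]):
                    partition = [x for x in partition if x not in (ci, cj)] + [ci | cj]
                    changed = True
                    break
            if changed:
                break
        if changed:
            continue
        # Split rule: try a random bipartition of each coalition.
        for c in partition:
            if len(c) < 2:
                continue
            sub = frozenset(rng.sample(sorted(c), len(c) // 2))
            if prefers([sub, c - sub], [c]):
                partition = [x for x in partition if x != c] + [sub, c - sub]
                changed = True
                break
    return partition
```

The loop terminates because each applied rule strictly improves the comparison relation and the number of partitions is finite, matching Theorem 4.1; like Algorithm 4.3, the split step samples a random bipartition rather than trying all subsets.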
each partition some partitioned subsets of the grand coalition. In particular, D_p is the family of defection functions that allow the formation of all partitions of the grand coalition, and D_c are the functions that allow the formation of all collections in the grand coalition. According to the results of [25], D_c-stability is key to a unique outcome.
Definition 4.4 A partition P = {P_1, ..., P_k} of N is D_c-stable iff the following two conditions are satisfied:
• for each i ∈ {1, ..., k} and each pair of disjoint coalitions A and B such that A ∪ B ⊆ P_i, {A ∪ B} ▷ {A, B};
• for each coalition T ⊆ N such that T ⊄ P_i for all i ∈ {1, ..., k},

{T}[P] ▷ {T},

where

{T}[P] := {P_1 ∩ T, ..., P_k ∩ T} \setminus {∅}.
Adopting the analysis of [25], we obtain the following theorem guaranteeing the unique outcome of the merge and split rules.
Theorem 4.2 Assume that ▷ is a comparison relation and P is a D_c-stable partition; then P is the unique outcome of every iteration of the merge and split rules.
4.5 Simulation Results and Analysis
4.5.1 Simulation Setup
Energy usage data can be collected using profilers such as the MAUI profiler in [3], along with the computational cycles per task and the amount of data transferred between tasks. The data can be either files or the state transferred between devices for distributed task execution. All this information is taken as input to the task distribution optimization problem to determine which task should be assigned to which device so that the overall energy consumption is minimized.
In our simulation, we adopt the task graph shown in Fig. 4.3, set the computational cycles of each task to 20–100 M cycles, and set the data transferred on each link to 10–1000 KB. As pointed out in [23], the energy consumption of transmitting a fixed amount of data is related to the available bandwidth or connection condition. As further verified in [3], downloading 50 KB of data over WiFi is the most energy-efficient, at approximately 20 mJ/KB, while downloading over GSM or 3G costs around 50–200 mJ/KB. Moreover, transmitting data under bad connectivity can consume much more energy than under good conditions, regardless of the connection mode of the smartphone.
In the experiments, we fix the topology of the task graph and randomly generate resource connection graphs in which every two nodes are linked with a given probability. The data transfer energy cost of a link is uniformly distributed on [20, 200] mJ/KB, and the computational energy cost of each device is uniformly drawn from [40, 60] mJ/M cycles. Given the task and resource graphs, each algorithm (centralized, merge-and-split with Pareto order, and merge-and-split with utilitarian order) is run 50 times to obtain the following results.
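The random instances can be reproduced with a small generator; the parameter ranges follow the setup described in this section, while the symmetric-link simplification and the function shape are our own assumptions:

```python
import random

def random_resource_graph(n_users, p_connect, seed=0):
    """Generate a random resource graph following the simulation setup:
    every pair of users is linked with probability p_connect, link
    transfer cost uniform on [20, 200] mJ/KB, per-device computation
    cost uniform on [40, 60] mJ per M cycles.  Links are made symmetric
    here, a simplifying assumption of this sketch."""
    rng = random.Random(seed)
    a = {n: rng.uniform(40, 60) for n in range(n_users)}    # mJ / M cycles
    b = {}
    for n in range(n_users):
        for m in range(n + 1, n_users):
            if rng.random() < p_connect:
                cost = rng.uniform(20, 200)                 # mJ / KB
                b[(n, m)] = b[(m, n)] = cost
    return a, b
```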
4.5.2 Performance Evaluation
In this section, we compare the performance of the merge-and-split algorithms with the centralized algorithm and with the non-cooperative setting, in terms of average energy cost per user, average running time, and average coalition size. The centralized algorithm traverses all partitions of the given resource graph and finds the optimal partition that minimizes the total energy cost over the entire graph. In the non-cooperative setting, each user finishes the job individually without incurring any communication cost. Simulation results are given in Fig. 4.4, Fig. 4.5, and Fig. 4.6, respectively.
Fig. 4.4(a-d) shows the average energy cost per user for different resource graph sizes. The energy costs are averaged over randomly generated resource graphs with a fixed probability that every two nodes are connected. Fig. 4.4(a) represents the situation where most of the nodes in the resource graph are disconnected, while Fig. 4.4(d) shows the other extreme, where the resource graph is almost fully connected. The proposed algorithm yields a significant reduction in average energy cost per user, up to 71.59% relative to the non-cooperative case. This advantage grows with the total number of users and with the connection ratio of the users: as both increase, each user has many more external links to choose from. In other words, constraint (4.7) is much more relaxed.
However, as the connection ratio increases, the gap between the merge-and-split algorithms and the centralized algorithm also widens. The gap exists because the merge-and-split algorithm terminates but does not necessarily yield a unique outcome; only part of the entire partition set of the resource graph can be traversed before termination. For example, in one simulation round with 7 users, the optimal partition was {{1, 2}, {3, 6, 7}, {4, 5}}, whereas merge-and-split with utilitarian order produced {{1, 2, 3}, {4, 5, 6}, {7}}. The latter partition cannot be transformed into the former by the merge and split rules, since {1, 2, 3} ▷ {{1, 2}, {3}} and thus the coalition {1, 2, 3} cannot be split up. The difference between the average energy computed by merge-and-split and by the centralized algorithm grows with the connection ratio for the following reason: when the connection ratio is low, in the optimal structure users are more likely to split up rather than form coalitions, which is easily achieved by the split rule; when the connection ratio is high, the optimal structure may not be reachable by the merge and split rules.
In each set of experiments, the centralized algorithm gives the lower bound, but it becomes hardly feasible as the total number of users increases. For 3, 5, and 7 users, the mean energy cost of merge-and-split with Pareto order is on average 6.23% above the lower bound, while that with utilitarian order is 5.71% above. We also observe that, while merge-and-split with Pareto order incurs more average energy cost than utilitarian order when the total number of users is below 10, it incurs less energy cost beyond 10 users. Indeed, the Pareto order is stricter than the utilitarian order: whenever the comparison relation satisfies the individual (Pareto) order, it satisfies the coalitional (utilitarian) order as well, but the reverse is not necessarily true.
Fig. 4.5(a-d) shows the average coalition size as the underlying topology of the resource graph varies. There are two trends: the average coalition size is larger when there are more users in total, or when the user connecting ratio is higher. That is, when more users are connected, they tend to merge into coalitions. This observation aligns with the principle that trading communication costs for computation costs reduces the overall energy cost. Moreover, the variance of the average coalition size is higher when users are less connected, largely because the computational cost then accounts for a significant part of the overall cost. We also observe that, with 10 users, merge-and-split with Pareto order results in an average coalition size of 1.4516, smaller than the 1.6034 of utilitarian order. Viewing Fig. 4.4 and Fig. 4.5 together, it is interesting to see that the average energy cost does not necessarily drop as the coalition size grows: although the centralized algorithm yields larger coalitions than merge-and-split while costing less per user, the utilitarian order yields larger coalitions than the Pareto order with a higher average energy cost per user.
Fig. 4.6 illustrates the average running time, averaged over all user connecting ratios, for different numbers of users for each algorithm. The centralized algorithm is highly inefficient when the number of users exceeds 7, while merge-and-split performs significantly better and scales well as the number of users increases.
4.6 Summary
In this work, we study the problem of coalition formation among a group of collaborative mobile users. To distribute the tasks while minimizing the energy costs, we formulate the problem as a 0-1 integer programming problem and apply a heuristic method to solve it. However, assigning the tasks globally in a centralized way is both impractical in the real world and infeasible to compute. Thus we devise a merge-and-split algorithm in which the decision to join or split from a coalition is made distributedly by each user, considering only the utility improvement of the users in that coalition. We also reveal, in the stability analysis, the conditions under which the merge-and-split algorithm yields a unique outcome. Finally, the simulation results show that our algorithm obtains near-optimal results and is far more efficient than the centralized strategy.
[Figure: four panels; x-axis: number of users (3, 5, 7, 10, 15, 20); y-axis: average energy cost per user (mJ); curves: non-cooperative, merge-and-split (Pareto order), merge-and-split (utilitarian order), centralized]
Figure 4.4: Average energy cost per user when users are non-cooperative, or run the centralized and merge-and-split algorithms with Pareto and utilitarian orders. (a) User connecting ratio 0.10. (b) User connecting ratio 0.35. (c) User connecting ratio 0.60. (d) User connecting ratio 0.95.
[Figure: four panels; x-axis: number of users (3, 5, 7, 10, 15, 20); y-axis: average coalition size (number of users per coalition); curves: centralized, merge-and-split (Pareto order), merge-and-split (utilitarian order)]
Figure 4.5: Average coalition size when users run the centralized and merge-and-split algorithms. (a) User connecting ratio 0.10. (b) User connecting ratio 0.35. (c) User connecting ratio 0.60. (d) User connecting ratio 0.95.
[Figure: x-axis: number of users (3, 5, 7, 10, 15, 20); y-axis: average running time (s); curves: centralized, merge-and-split (Pareto order), merge-and-split (utilitarian order)]
Figure 4.6: Average running time when users run the centralized and merge-and-split algorithms.
Chapter 5
Conclusion
With a focus on energy-efficient mobile offloading, this thesis studied two major problems: coalesced offloading from mobile devices to the cloud, and coalition formation among collaborative mobile users. In essence, mobile offloading achieves energy savings by trading relatively low communication costs for high computation costs. The main contributions of the two works are listed below.
In Chapter 3, we proposed the idea of coalesced offloading, which batches offloading requests from multiple applications to reduce the period during which the smartphone stays in the high-power state. The problem is formulated as a joint optimization over both energy consumption and response time. An offline solution is designed to serve as the performance benchmark, and an online algorithm is derived that achieves the optimal competitive ratio.
In Chapter 4, we studied how a group of mobile users can collaborate with each other on one job. The problem is formulated as a coalitional game in which users choose to join or split from coalitions depending on the energy cost of the coalition; the energy cost is computed based on the task assignment to all users within the coalition. Since finding the optimal partition of the coalition group is NP-hard, we tackled the problem with a merge-and-split algorithm, and verified its efficiency through simulations.
While this thesis has raised and solved some practical problems in the area of mobile cloud computing, there are many more interesting problems pointing to new and challenging directions for future research. We detail some of them in the following.
• Optimal mobile offloading with network performance prediction. In the previous literature, it is assumed that the smartphone has stable network connections during the offloading process. This assumption is not valid in practice, especially when users carry their phones around. When the connection is not sufficiently good, offloading to the cloud may drain the battery rather than save energy. Hence it is a very practical consideration to incorporate user mobility and network connection prediction into the mobile offloading framework. For the prediction aspect, machine learning techniques may be used.
• Optimal mobile offloading for specific types of applications. Until very recently, most mobile offloading frameworks were designed for general applications. However, it is not efficient to apply such a general framework to every application. Some applications have particular characteristics: for example, a typical learning algorithm minimizes an objective function to obtain a model, and iteratively refines this model by processing the training data; an iBeacon app may constantly gather data from nearby beacons and upload information to the cloud. Such applications need a mobile offloading framework optimized specifically for them. Hence the question is no longer simply whether to offload, but how to take the application structure into consideration.