A Game Theoretic Perspective on Self-organizing ... · Index Terms—5G, cognitive small cells,...

arX

iv:1

503.

0020

5v2

[cs.

IT]

5 M

ar 2

015

1

A Game Theoretic Perspective on Self-organizingOptimization for Cognitive Small CellsYuhua Xu, Jinlong Wang, Qihui Wu, Zhiyong Du, Liang Shen and Alagan Anpalagan

Abstract—In this article, we investigate self-organizing opti-mization for cognitive small cells (CSCs), which have the abilityto sense the environment, learn from historical information, makeintelligent decisions, and adjust their operational parameters.By exploring the inherent features, some fundamental challengesfor self-organizing optimization in CSCs are presented anddis-cussed. Specifically, the dense and random deployment of CSCsbrings about some new challenges in terms of scalability andadaptation; furthermore, the uncertain, dynamic and incompleteinformation constraints also impose some new challenges intermsof convergence and robustness. For providing better service tothe users and improving the resource utilization, four require-ments for self-organizing optimization in CSCs are presentedand discussed. Following the attractive fact that the decisionsin game-theoretic models are exactly coincident with thoseinself-organizing optimization, i.e., distributed and autonomous,we establish a framework of game-theoretic solutions for self-organizing optimization in CSCs, and propose some featuredgame models. Specifically, their basic models are presented, someexamples are discussed and future research directions are given.

Index Terms—5G, cognitive small cells, self-organizing opti-mization, game-theoretic models, distributed learning

I. I NTRODUCTION

SMALL cells have been regarded as a promising approachto meet the increasing demand of cellular network capac-

ity. In comparison to macro-cells, low-cost small cells operat-ing with low-power and short range offer a significant capacitygain due to spatial reuse of spectrum. Researchers in thecommunity have realized that enabling cognitive ability intosmall cells, which is referred to as cognitive small cells (CSCs)[1], would further improve the resource utilization. Similar tocognitive radio, CSCs are able to sense the environment, learnfrom historical information, make intelligent decisions,andadjust their operational parameters.

It is expected that small cells are to bedenselydeployedin the near future. Furthermore, small cells may be deployedby the mobile operators, the enterprises or households, whichmeans that they would operate in a self-organized, dynamicand distributed manner. Thus, resource optimization problemsfor small cells, e.g., spectrum sharing, carrier selectionand

This work was supported by the National Science Foundation of Chinaunder Grant No. 61401508 and No. 61172062.

Yuhua Xu, Jinlong Wang, Qihui Wu, Zhiyong Du and Liang Shen arewith the College of Communications Engineering, PLA University ofScience and Technology, Nanjing, China (e-mail: [email protected],[email protected], [email protected], [email protected],[email protected] ).

Alagan Anpalagan is with the Department of Electrical and Computer Engi-neering, Ryerson University, Toronto, Canada (e-mail: [email protected]).

power control, interference management, and offloading mech-anism, can not be solved in a centralized manner since itresults in heavy communication overhead and can not adapt todynamic environment. As a result, it is important and timelyto develop self-organizing optimization approaches for CSCs.

In this article, by exploring the inherent features of CSCs,we first discuss and analyze some fundamental challengesand requirements for self-organizing optimization in CSCs.Following the attractive advantages of game-theoretic modelsfor self-organizing optimization, we propose some featuredgame-theoretic solutions. It should be pointed out that thereare also some other useful approaches for self-organizingoptimization in distributed wireless networks, e.g., the swarmintelligence inspired evolutionary algorithms [2]. The reasonsfor using game-theoretic solutions are: i) the interactionsamong multiple decision-makers can be well modeled and ana-lyzed, and ii) the outcome of the game is predicable and hencethe system performance can be improved by manipulating theutility function and the action update rule of each decision-maker.

Game-theoretic models have been investigated extensivelyin wireless communications, and there are some preliminarygame-theoretic solutions for CSCs, e.g., reinforcement learn-ing with logit equilibrium for power control [3], a hierarchicaldynamic game approach for spectrum sharing and service se-lection [4], and evolutionary game for self-organized resourceallocation [5]. The presented models in this article mainlyaddress the inherent features, fundamental requirements andchallenges of CSCs and hence differentiate significantly fromprevious ones seen in the literature. In fact, the main objectiveof this article is to propose and discuss the featured game-theoretic models suitable for CSCs.

The rest of this article is organized as follows. In Section II,the cognition functionality for CSCs is presented. In SectionIII, some fundamental challenges and requirements for CSCsare discussed. In Section IV, some featured game-theoreticmodels for self-organizing optimization in CSCs are presented,and future research directions are given. Finally, we provideconcluding remark in Section IV.

II. COGNITION AND SELF-ORGANIZING OPTIMIZATION

FOR COGNITIVE SMALL CELLS

We first present the cognition functionality for CSCs, whichis the base for self-organizing optimization. The cognitionfunctionality in cognitive radio is mainly concerned withacquiring spectrum availability information, i.e., sensing andidentifying spectrum opportunities in time, frequency and

http://arxiv.org/abs/1503.00205v2

2

space domains. To capture the complex environment andnetwork state, the cognition functionality in CSCs is ex-tended to explore multi-dimensional information. Such multi-dimensional information is referred to ascontextual informa-tion, which is used to identify an object of interest.

As illustrated in Fig. 1, the target contextual informationincludes user type, user demand, access content, user per-ception, location, energy, network state and spectrum state.For presentation, we briefly illustrate the above contextualinformation. The user type represents the hardware categoryi.e., tablet, phone or a laptop. Access content is relatedto specific application, e.g., browsing a breaking news ordownloading an APP. User perception is related to the servicequality experienced by the user, while location is where thesmall cell is, e.g., indoor or outdoor. Energy is related tothe power for wireless transmission and cooling. Networkstate is related to the current network deployment of smallcells and spectrum state is related to the spectrum availability.Technically, the above contextual information has great impacton the resource allocation schemes, which will be discussedlater.

As shown in Fig.1, the contextual information in CSCs canbe obtained by the following three methods: sensing, database,and information exchange among neighbors. i) Sensing: withthe help of cognitive radio technologies, CSCs perform sens-ing to obtain useful information. For example, the spectrumoccupancy state can be obtained by energy detection or featuredetection (e.g., pilot, modulation type, cyclic prefixes, andcyclostationarity). Sensing is real-time but consumes resourcesincluding time, energy and bandwidth. ii) Database: databaseapproach is a powerful tool to provide useful informationfor CSCs. For example, spectrum database has recently beendeveloped to provide spectrum occupancy state for a particularregion, through which the CSCs can inquire the location-dependent spectrum availability information. Compared withsensing, database approach is more efficient but is not real-time. iii) Information exchange among neighbors: since theCSCs are connected to the core network via cable or opticalfiber, information exchange among CSCs are feasible. How-ever, it is to be noted that local information exchange betweenneighbors is more desirable since global information exchangeleads to heavy communication and signalling overhead.

We argue that the term “cognitive” in CSCs is not limited toobserving the environment and acquiring contextual informa-tion. Instead, it should include having high-level intelligence.To achieve such high-level intelligence for CSCs, the mostpromising way is to realize knowledge discovery from the con-textual information. Generally speaking, knowledge is a broadconcept including general principles and natural laws. Takingthe spectrum dynamics as an example, the probabilityθ thata particular band is occupied by the macro-cells during 01:00AM to 04:00 AM is very small, e.g.,θ = 0.05, is viewed asthe knowledge in CSCs. Based on the contextual information,the CSCs can build knowledge base which contains usefulknowledge for self-organizing optimization.

III. C HALLENGES AND REQUIREMENTS FOR

SELF-ORGANIZING OPTIMIZATION IN COGNITIVE SMALL

CELLS

In this section, by exploring the inherent features of CSCs,we briefly discuss some fundamental challenges and require-ments for self-organizing optimization in CSCs.

A. Technical challenges

The technical challenges for self-organizing optimizationin CSCs are discussed from the perspectives of networkdeployment and information constraints respectively.

First, from the perspective of network deployment of CSCs,two challenges arise in the following two aspects:

• Scalability. With the increase in the number of smallcells, how would the self-organizing optimization solu-tions scale up? This is the first basic issue of densedeployment in CSCs. In addition, since the decisionsof CSCs are interactive, addressing the complicated in-teractions among the densely deployed CSCs is anotherimportant issue.

• Random deployment. The small cells are deployed bydifferent entities, e.g., mobile operators, the enterprise orhouseholds. In addition, they may turn to be inactive ifthere is no serving client. As a result, the deployment ofsmall cells is always random and dynamic. Thus, it isimportant for the self-organizing optimization solutionsto behave in random and dynamic environment.

Secondly, it is known that information is key to optimizationproblems, and the challenges related to information arising inthe CSCs are listed below:

• Uncertain: the observed information may not be the samewith the true information. A well-known example is thatthe sensed spectrum states are always imperfect due tothe corruption of noise.

• Dynamic: the observed information is time-varying thedynamic changes are not determinate. For example, thenetwork and spectrum states may change from time totime, and the set of active CSCs may also change fromtime to time. Furthermore, the network and spectrumstates in each decision period are random, the set of activeCSCs are random, and their demands are also random.

• Incomplete: due to the constraints in hardware andresource consumption, each CSC only has partial infor-mation about the environment; furthermore, it only hasinformation of its neighboring CSCs (in some extremescenarios, it has no information of others). In addition,a CSC does not know the total number of small cells inthe systems, not to mention the active ones.

Due to the above technical challenges, it is seen that thetask of resource optimization in CSCs is hard to solve evenin a centralized manner, not to mention in a distributed andself-organizing manner.

B. Requirements for self-organizing optimization

We list some fundamental requirements for self-organizingoptimization in CSCs. Specifically, these requirements are

3

Database

Sensing

Information Exchange

among Neighbors

Context

Cognition

Energy

Location

User Perception

User Demand

Self-organizing

Optimization

Performance

Evaluation and

Feedback

Adaptation and

Re-configuration

Cognition and Self-organizing

Optimization for Cognitive Small Cells

User type

Spectrum State

Network State

Knowledge

Base

Heterogeneity

Robustness

Scalability

User-centric

optimizationAccess content

Fig. 1. The paradigm of cognition functionality and self-organizing optimization in CSCs.

for user service, network deployment (architecture), and op-timization methodology. As shown in Fig. 1, based on thecontextual information and knowledge, some self-organizingoptimization approaches can be applied for resource allocationin CSCs. By employing their inherent features, we discusssome featured requirements of self-organizing optimizationin CSCs, which mainly include user-centric optimization,scalability, robustness and heterogeneity.

First, it should shift from throughput-oriented optimizationto user-centric optimization. Traditionally, resource optimiza-tion schemes in wireless systems are throughput-oriented,withthe objective to maximize throughput/capacity or minimizedelay. However, it is now realized that throughput-orientedschemes can not provide satisfactory service for the users.Infuture mobile communication systems, there is an increasingtrend to develop user-centric optimization schemes ratherthan throughput-oriented schemes. The underlying reasonsaretwofold: (i) eventually, the purpose of (wireless) communica-tion is to serve end users. Thus, the contextual informationof the users, e.g., their locations, demands, access contentsand energy, should be taken into account not only in high-layers but also in PHY and MAC layers for optimization,(ii) it is realized that mobile (cellular) systems have migratedtowards data and Internet services. In particular, multimediaservice delivery through cellular systems, e.g., watchingonlinevideo, is becoming common. For this kind of service, peoplemay not care about the specific volume of allocated resources,but sensitively react to the perceived service quality, which isknown as quality of experience (QoE) [6]. It means that userperception should also be taken into account in self-organizingoptimization.

Secondly, it should admit scalability and address networkdensity. As stated before, it is expected that the CSCs aredensely deployed with large numbers. A consequence is thatthe resource optimization for dense deployment is completelydifferent than that in sparse environment. Thus, the self-organized optimization schemes should scale up in densedCSCs. In addition, density creates congestion and interference

among CSCs, which implies that efficient congestion controland interference mitigation approaches should be developed.

Thirdly, it should be robust to the dynamic environment.As discussed before, there are several random and dynamicfactors in CSCs, e.g., the spectrum availability is dynamic, theCSCs switch between active and inactive randomly. Moreover,the observed information may be corrupted by noise. Thus,the self-organizing optimization solutions should be robust toaddress the randomness, dynamics, and uncertainty in CSCs.

Last, it should address the hierarchical decision-making inCSCs. There are always heterogeneous cells with overlappingcoverage in future wireless systems, i.e., macro-cells andsmallcells. In such hierarchical networks, the cells in differentlayers have different priority and utility functions. Thatis, itinvolves heterogeneous decision-makers. However, traditionalself-organizing optimization solutions are mainly for homoge-neous decision-makers. Thus, it is important to develop newhierarchical self-organizing solutions for CSCs.

IV. GAME-THEORETICSELF-ORGANIZING OPTIMIZATION

FOR COGNITIVE SMALL CELLS

Game theory [7] is an applied mathematic tool to modeland analyze mutual interactions in multiuser decision systems.Generally, a game model consists of a set of players, a setof available actions of each player, and a utility function thatmaps the action profiles of the all the players into a real value.There are two major branches of game-theoretic models: non-cooperative games and cooperative games. From a high-levelperspective of comparison, players in a non-cooperative gamemake rational decisions to maximize their individual utilityfunctions, while players in a cooperative game are groupedtogether according to an enforceable agreement for payoffallocation. In a non-cooperative game, the commonly usedsolution concepts are Nash equilibrium (NE) and correlatedequilibrium.

Researchers began to apply game-theoretic models intowireless communications a decade ago; nowadays, it has beenregarded as a powerful tool for wireless resource allocation

4

Self-organizing optimization problems in cognitive

small cells, e.g., spectrum access, power control,

and offloading mechanism.

Game formulation and

analysis

Design of multiuser

learning algorithms

Make the stable solutions

optimal (near-optimal)

Find the optimal (better)

stable solutions under

practical constraints in

wireless environment

Solve the original

optimization problems

distributively and

autonomously.

Fig. 2. The proposed framework of game-theoretic solutionsfor self-organizing optimization in CSCs.

optimization, e.g., power control, spectrum access, networkselection, spectrum auction and trading, and incentive mecha-nism design. The decisions of the players in (non-cooperative)game-theoretic models are distributed and autonomous, whichis an exact coincidence with those in self-organizing optimiza-tion. Thus, game-theoretic approach is important to achieveself-organizing optimization in CSCs [8].

A. Framework of game-theoretic self-organizing optimization

To cope with the technical challenges in CSCs, i.e., denseand dynamic deployment, and uncertain, dynamic and in-complete information constraints, we propose a framework ofgame-theoretic self-optimizing optimization, which is shownin Fig. 2. It is noted that there are two key steps: i) gameformulation and analysis, and ii) design of multiuser learningalgorithm. In comparison, game formulation and analysisbelongs to theoretical investigation while developing learningalgorithms belongs to the field of algorithm methodology.On one hand, the stable solutions are the inherent propertiesof game-theoretic models, and not relevant to the learningalgorithms. On the other hand, except for the utility function ingame-theoretic models, the uncertain, dynamic and incompleteinformation constraints have great impact on the convergenceand performance of learning algorithms. With the two stepscarefully designed, self-organizing optimization for theorigi-nal wireless optimization problems can be achieved.

1) Game formulation and analysis:For game formulation,one needs to first identify the player and available action set,and define suitable utility functions for the players. For CSCs,the player may be a single entity, e.g., small base stationor the user equipment, or a collection of multiple entities,e.g., a cluster consisting of multiple nearby small cells. Theavailable action set can be regarded as a combination ofmultiple optimization variables. Defining utility function iskey to game formulation since it eventually determines theproperties and performance of the game-theoretic models.

The most efficient game-theoretic model used in wirelessnetworks is potential game [9], in which there is a potentialfunction such that the change in the utility function causedby the unilateral action change of an arbitrary player hasthe same trend with that in the potential function, i.e., both

increasing or decreasing. Potential game has at least one purestrategy NE and all NE points are global or local maxima ofthe potential function. Thus, the NE solutions are desirable ifthe potential function is related to the original optimizationobjective. Furthermore, to ensure that the stable solutions ofgame-theoretic models are optimal (near-optimal), an anotherefficient method is to define the utility function as the receivedpayoff minus cost of using the amount of particular resource.

2) Design of multiuser learning algorithms :Identifyingthe stable solutions of game-theoretic models is one thing,butfinding them is another different thing. This issue, however,was underestimated in previous studies. In pure game theory,players can monitor the environment and other players, whichmeans that they have perfect information about the actions andpayoffs of other players. As discussed above, this assumptiondoes not hold in CSCs. With the cognition functionality ofCSCs, players need to observe the results of multiuser interac-tions, e.g., interference, collision and competition, learn usefulinformation from the limited feedback, and then adjust theirbehaviors towards some desirable solutions. In the contextofgame optimization, the objective of multiuser learning is toconverge to a stable solution with good performance.

Denotean(k) as the action of playern in thekth iteration,anda−n(k) as the action profile of all other players exceptn.Due to the interactions (interference, congestion or competi-tion) among players, the received payoffrn(k) of each playeris jointly determined by the action profile of all players, and itmay be deterministic or random. Generally, the players updatetheir actions based on the current action-payoff information{

an(k), a−n(k); rn(k), r−n(k)}

. Thus, the system evolutioncan be described as{an(k), a−n(k)} → {rn(k), r−n(k)} →{an(k + 1), a−n(k + 1)}, and the objective is to converge toa stable action profile that maximizes the system utility.

The uncertain, dynamic and incomplete information con-straints in CSCs may pose some new challenges. Specifically,i) a player does not know the information about all otherplayers, i.e.,a−n(k) andr−n(k) are unknown, ii) the receivedpayoff rn(k) may be random and time-varying. Thus, the up-date rule needs to be carefully designed to guarantee the con-vergence towards desirable solutions. When local informationamong neighboring players is available, it is desirable to de-velop partially uncoupled learning algorithms based on thepar-tial action-payoff information

{

an(k), aJn(k); rn(k), rJn

(k)}

,whereaJn

(k) andrJn(k) are the action-payoff information of

the neighboring players. In some extreme scenarios with noinformation exchange available, one needs to develop fullyuncoupled learning algorithms based on the individual action-payoff information

{

an(k), rn(k)}

. There are some usefulpartially coupled learning algorithms, e.g., local altruisticbehavior with spatial adaptive play [10], and fully uncoupledlearning algorithms, e.g., stochastic learning automata [11],that can be applied for self-organizing optimization in CSCs.

B. Some featured game-theoretic models for CSCs

In methodology, previous game-theoretic models in wirelesscommunications can be applied to CSCs. However, mostprevious game-theoretic solutions mainly focused on analyzing

5

10 20 30 40 50 60 70 800.4

0.5

0.6

0.7

0.8

0.9

1

The number of cognitive small cell

Sat

isfie

d us

er r

atio

The proposed concave satisfactionThe linear satisfactionThe sigmoid satisfactionThroughput maximization

Fig. 3. The comparison results of satisfied user ratio for different satisfactionutility functions.

the properties of game-theoretic models in ideal scenarios, anddid not take into account the challenges and requirements inCSCs. In this subsection, we propose some featured gamemodes for CSCs. Due to the low-power, the transmission ofa small cell only affects its neighbors; as a result, graphicalgames [10] are appropriate for small cell networks.

1) Demand-aware game:Most existing resource optimiza-tion approaches mainly focused on maximizing the resourceutilization, while ignoring the actual demand of the users.In future CSCs, demand-aware design and decision is moredesirable. To include the user demand into the resource op-timization problems, a useful method is to map the allocatedresource to the user satisfaction utility. Specifically, denoteusern’s demand asdn and the allocated resource asrn, thenits satisfaction utility can be expressed assn(rn, dn).

Generally, there are two kinds of satisfaction functions inthe literature: i) the linear satisfaction: the satisfactory utilityis determined byrn

dn

if rn ≤ dn, and is equal to one otherwise,and ii) the sigmoid satisfaction: the satisfactory utilityfunctionis generally determined by 1

1+e−c(rn−dn) , wherec is used toadjust the slope of the satisfaction utility curve around theuser demanddn with different types of traffic. In particular,real-time traffic such as online video are sensitive to acquiredresources and require strict performance requirements, whichcorresponds to large value ofc, while non-real-time trafficsuch as E-mail or file transfer are less sensitive, whichcorresponds to small value ofc. The linear and sigmoidsatisfaction functions have been well investigated in previousgame-theoretic wireless resource optimization problems.Asthe satisfaction utility function is strictly increasing,each userproceeds to compete for wireless resource even if the obtainedresource is larger than the demand, which would decreasethe satisfaction of others. However, this drawback has notaddressed in previous work.

To improve the network satisfaction, an efficient approachis to prevent the users compete for extra resources when itis satisfied and to decrease the satisfaction utility when itoccupies additional resources. Based on this intuition, the con-cave satisfaction utilities may be more suitable for multiuser

1x

2x

2 1( )u BR x

1 2( )u BR x

Nash equilibrium

*

1x

(a) The illustrative diagram of Nash equilibrium

for games with continuous utility function

1x

2 1( )u BR x

1 2( )u BR x

Region of Nash equilibrium

1

lx

1

hx

2

lx

2

hx

(b) The illustrative diagram of Nash equilibrium for

QoE-aware game with non-continuous utility function

2x

*

2x

Fig. 4. An illustrative expansion of Nash equilibrium of discrete-QoE-aware game.BR(x2) (BR(x1)) denotes the best response curve of player1 (player 2) with respect to the decision variable of the other player. In(a), the intersection point(x∗

1, x∗

2, ) is Nash equilibrium. In (b), due to the

discontinuous feature of QoE-aware game, the Nash equilibrium is expandedto the shadow region.

communication networks. An example of concave satisfactionutilities is given by

(

2√rndn

dn+rn

)α, whereα is used to adjust

the slope of the utility curve for different types of traffic.Forillustration, we consider the problem of distributed spectrumaccess for CSCs. Specifically, the CSCs are randomly locatedin a region of 100m× 100m, and the sensing-based spectrumaccess protocol proposed in [1] was applied. The problemof distributed spectrum access is formulated as a graphicalgame and the learning automata [11] is applied. Differenttypical applications, such as G.711PCM, WMV, AVI/RM,Flash, H.264, are considered in the simulation. The compar-ison results are presented in Fig. 3. It is noted that with theproposed satisfaction function, the satisfaction user ratio islargely improved. In particular, as the network scales up, thethroughput gain becomes significant.

2) Discrete-QoE-aware game:Eventually, the purpose ofwireless communications is to serve people. Thus, the per-ception by people, i.e., QoE, should be included in the gameformulation. Unlike the satisfaction function which is charac-terized by continuous and real values, the perception of peopleis generally subjective and discrete. For example, a personmay feel “Excellent”, “Good”, “Fair”, “Poor” and “Bad”,by the method of the mean opinion score [6]. Comparedwith traditional continuous optimization game, an interestingresult of discrete-QoE-aware game is the expansion of Nashequilibrium, which is shown and illustrated in Fig. 4.

In traditional continuous optimization game, users maximizetheir throughput as there is an inherent principle: largerthroughput is always better. On the contrary, a user in discrete-QoE-aware games does not always maximize its throughputunless its QoE level can be improved, e.g., from “Poor” to“Fair”. Thus, it can be expected that the discrete-QoE-awaregame would improve the network QoE.

To further show the benefit of discrete-QoE-aware game, weconsider the problem of distributed user association in LTE-Asmall cell networks. For users located in the overlapping areas,there are multiple small cell access points (SAPs) availableand the users need to choose one to associate. Considerthree types of video calling users using Skype: i) the firstone is the group video calling with the required minimalthroughputRm = 512kbps and the recommended throughput

6

Excellent Good Fair Poor Bad0

2

4

6

8

10

12N

umbe

r of

use

rs

Best NE of discrete−QoE−aware gameBest NE of continuous throughput gameThroughput maximization

Fig. 5. The number of users in different QoE levels of different solutions.

Rc = 2Mbps, ii) the second one is the high definition videocalling with Rm = 1.2Mbps andRc = 1.5Mbps, and iii)the third one is the general video calling user with the withRm = 128kbps andRc = 500kbps. Each user falls intoone of the above three types with equal probabilities. It isbelieved that the minimal throughput only supports the basicuser demand (“Poor”), while the recommended throughputoffer sufficiently good user experience (“Good”). With themethod proposed in [12], the throughput thresholds for otherQoE levels (“Excellent”, “Fair” and “Bad”) can be obtainedaccordingly.

Considering a network with 78 users which can access onlyone SAP, and 20 users which are located in the overlappingregions of neighboring CSCs, the comparison results of thenumber of users in different QoE levels are shown in Fig. 5.It is seen that the discrete-QoE-aware game outperforms thecontinuous optimization game. Specifically, with the discrete-QoE-aware game, 12 users are in “Excellent”, one in “Good”,two in “Fair”, and five in “Bad”; with the continuous opti-mization game, 11 users are in “Excellent”, none in “Good”,three in “Fair” and five in “Bad”. Also, it is noted thatthe throughput maximization approach (the user demand isneglected) achieves poor network QoE. This result validatesthe superiority of the discrete-QoE-aware game.

3) Hierarchical game:As stated before, CSCs would in-teract with macro-cells for dynamic spectrum sharing andmobile offloading. While originally studied in the economiccontext of duopolies in which one company has the powerto act before the other companies, Stackelberg game, whichis an important kind of hierarchical games, is suitable forsystems that contain a natural hierarchy. Therefore, to addressthe hierarchical decision-making between macro-cells (alwaysacting as a leader) and CSCs (always acting as followers),Stackelberg game is becoming a useful tool [4].

In addition, following the idea of “divide-and-conquer”,hierarchical game can also be used to address the dense de-ployment of CSCs. In particular, in order to ease the challengescaused by the large number of participants, we can createhierarchy to transform the large-scale optimization probleminto several layers of sequential sub-problems. To achieve

2

1

3 4

5

6 7

2

1

3 4

5

6 71CH

2CH

2C1C

2

1

3 4

5

6 71CH

2CH

2C1C

2

1

3 4

5

6 7

2C1C

(a) (b)

(c)(d)

Interference between

neighboring cluster members

Fig. 6. An example of using “divide-and-conquer” to use cluster-basedhierarchical game in large-scale CSCs. In (c) and (d), the colors represent thechannels chosen by the small cells.

100 200 300 400 500 600 700 800 9000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Iterations needed for converging

Cum

ulat

ive

dist

ribut

ion

func

tion

(CD

F)

Hierarchical−game based Q−learning (N=80)

Hierarchical−game based Q−learning (N=50)

Simultaneous Q−learning (N=80)

Simultaneous Q−learning (N=50)

Fig. 7. The convergence speed comparison between the hierarchical-gamebased Q-learning and simultaneous Q-learning.

this, a useful method is clustering. An example of creatinghierarchy to use cluster-based hierarchical game in large-scaleCSCs is shown in Fig. 6. In (a), the network topology and theinterference relationship is presented. In (b), the neighboringsmall cells distributively form two disjoint clusters withcells3 and cell serving as the headers respectively. In (c), in theupper layer, the headers compete for resources with each other,aiming to maximize the aggregate utility of the cluster; in thelower layer, the cluster members compete with other members,under the policies imposed by the header. In (d), since differ-ent clusters behave independently, there may be interferencebetween neighboring clusters, e.g., cells 4 and 5 still interferewith each other. Thus, the interfering cells further mitigatemutual interference via distributed learning, e.g., Q-learning.With the proposed cluster-based hierarchical structure, the self-organizing optimization in large-scale networks can then besolved with moderate computational complexity.

We compare the computational complexity between theproposed cluster-based hierarchical game and the simultaneousQ-learning approach [13], in which all cells performs Q-learning simultaneously. The achievable network throughputof two approaches is almost the same. The cumulative dis-tribution function (CDF) of the iterations needed to converge

7

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1100

200

300

400

500

600

700

800

900

The user active probability (λ)

The

exp

ecte

d sy

stem

thro

ughp

ut (

Mbp

s)

Optimum (Genie−aided exhaustive search)Best NE of robust gameWorse NE of robust gameDistributed learning automata

Fig. 8. The expected Shannon capacity when varying the active probabilitiesof the cells.

is shown in Fig. 7. It is noted from the figure that for thesame size network, e.g.,N = 50 or N = 80, the iterationsneeded for converging of the hierarchical-game Q-learningapproach are significantly decreased. Furthermore, when thenetwork scales up fromN = 50 to N = 80, the convergencespeed of the hierarchical-game based Q-learning approachslightly decreased while that of the simultaneous Q-learningapproach is largely decreased. This implies that the proposedhierarchical-game based approach is especially suitable fordense and large-scale networks.

4) Robust game:To capture the random and dynamic behav-iors in CSCs, robust game is a good candidate. Specifically, theutility function in robust games is defined over statistics [14],e.g., expectation or other high-order statistics. In the following,we present a robust spectrum access game for CSCs as anillustrative example.

Consider a distributed CSC network operating in the TVwhite space. Each cognitive SAP inquires the spectrum avail-ability from the geo-location database, which specifies theavailable channel set and the maximum allowable transmissionpower for each cell. To capture the dynamic cell load inpractical applications, we consider a network with a varyingnumber of active cells. Specifically, it is assumed that eachcellperforms spectrum access with probabilityλn, 0 < λn ≤ 1, ineach decision period. Note that such a model captures generalkinds of dynamics in wireless networks, e.g., a cell becomesactive only when it has data to transmit and inactive whenthere is no transmission demand. Also, it can be regardedas an abstraction of the dynamic cell loading, i.e., the cellactive probability corresponds to the probability of a non-empty loading buffer. Note that the active cell set in eachperiod is not deterministic and randomly changes from periodto period. Also, a cell does not know the active probabilitiesof other cells.

To address the dynamic and random deployment of CSCs, arobust spectrum access game can be formulated, in which theutility function of a CSC is defined as the expected Shannoncapacity over all possible active cell sets. The game can beproved to be a potential game [9] and hence the distributed

learning automata algorithm [11] can be applied to convergetoNE points in the dynamic environment. Taking a network withnine CSCs as an illustrative example, the throughput perfor-mance comparison results are shown in Fig. 8. The optimumis obtained using the exhaustive search method in a centralizedmanner, by assuming that there is a genie that knows allrequired information. The best and worst NE is obtainedusing the best response algorithm in a distributed manner, byassuming that information exchange among neighboring cellsis available. Some important results can be observed: i) thebest NE is almost the same with the optimal one, while thethroughput gap between the worse NE and the optimum isalso trivial, which validates the effectiveness of the formulatedrobust spectrum game, ii) the achievable throughput of thedistributed learning automata is very close to the optimal one.

5) Content-aware game:As legacy cellular systems havemigrated towards data and Internet services, taking into ac-count the access content in the self-organizing optimizationwould enjoy content gain. For Internet traffic, it has beenshown in [15] that a relatively small proportion of the accessitems accounts for a vast fraction of the information accesses,and the Zipf’s law can be used to determine the occurrence fre-quency of the access items, given the content rank, the contentpool size, and the characteristic curve of the access pattern.Nowadays, content caching has become a core technology forwireless cellular systems. Thus, it is reasonable to replicatesignificant portions of popular contents on the wireless caches.As a result, the search and access time for popular content isfast compared with that of unpopular content. The reason isthat the popular content can be accessed in the wireless caches,while the unpopular content is accessed in the far-side server.Therefore, the differences in access time of different contentswill have great impact on wireless resource allocation and itis promising to explore content-aware game-theoretic solutionsfor CSCs, which would achieve better performance.

C. Comparative summarization and analysis

In comparison, the game-theoretic models for CSCs pre-sented in this article different from previous ones significantly.Specifically, it shifts from throughput-oriented optimization touser-centric optimization, e.g., demand-aware game, discrete-QoE-aware game and content-aware game, addresses the densedeployment of small cells, e.g., graphical game and hierarchi-cal game, and copes with randomness and dynamics well, e.g.,robust game. Although the research on game-theoretic self-organizing optimization for CSCs is in infancy, we believe thatthe presented game-theoretic models will draw great attentionin the near future.

For a specific resource optimization problem in CSCs, onecan choose a suitable game-theoretic model and a learningalgorithm to construct a self-organizing optimization solution.However, it should be pointed out that game-theoretic so-lution for CSCs is application-dependent, which means thatthe game-theoretic model and distributed learning algorithmshould be carefully formulated and designed.

8

D. Future research direction

It is seen that game-theoretic solutions for self-organizingoptimization in CSCs have definitely drawn a beautiful and ex-citing future, though the current research is still far awayfromthe expected vision. We list some future research problems forgame-theoretic models and learning procedures below:

1) Develop or investigate new game-theoretic models forself-organizing optimization from social/biological be-haviors. The rationale behind is that in old days hu-manity first self-organized and then evolved success-fully with population growth. For example, motivatedby the local altruistic behavior in biological systems,a local altruistic game with each player maximizingits utility and the aggregate utilities of its neighborswas proposed to achieve global optimization via localinformation exchange [10]. The key design of this issueis to properly abstract and model the social/biologicalbehaviors, which is interesting and challenging.

2) It is noted that each kind of presented game mainlyaddresses a single aspect of challenges in CSCs. How-ever, as can be expected, one may combine more thanone game-theoretic model, e.g., robust discrete-QoE-aware game or Stackelberg graphical game, to addressmultiple aspects of challenges of CSCs simultaneously.Such combinations bring about new challenges since thegame structure is completely changed.

3) Design and analyze heterogeneous learning algorithms.In most existing studies, it is assumed that all thedecision-makers employed the same learning algorithm.However, this assumption is for academic research butnot true in practical systems. In practice, the smallcells may belong to different holders, which may adoptdifferent learning algorithms; in addition, even the smallcells belonging to the same holder may have differentprocessing ability and preference, and hence chooseheterogeneous learning algorithms. Introducing hetero-geneity into the learning procedure will change theconvergence and asymptotic behavior, which needs tobe further studied.

4) Design knowledge-assisted learning algorithms. Thecommon procedure in existing learning algorithms isto update the strategies based on the historical action-payoff information. It may take long time to convergeto stable solutions since the players need to explore allthe possible actions. As shown in Fig. 1, knowledgecan be viewed as high-level intelligence obtained fromthe contextual information, which is truly beneficial fordecision-making. Thus, we should develop some newknowledge-assisted learning technologies to increase theconverging speed and achieve better performance.

V. CONCLUSION

In this article, we investigated self-organizing optimizationfor CSCs, which will play an important role in future cognitivecellular systems. By exploring the inherent features, somefundamental challenges and requirements for self-organizing

optimization in CSCs were presented and discussed. Follow-ing the attractive advantages of game-theoretic models, i.e.,distributed and autonomous decision-making, a frameworkof game-theoretic solutions for self-organizing optimizationin CSCs was established, and some featured game-theoreticmodels were proposed. Specifically, the basic game-theoreticmodels are presented, some insights are discussed, someexamples are discussed and future research directions aregiven.

REFERENCES

[1] H. Elsawy, E. Hossain and D. I. Kim, “Hetnets with cognitive smallcells: User offloading and distributed channel access techniques,” IEEECommun. Mag., vol. 51, no. 6, pp. 28-36, June 2013.

[2] Z. Zhang, K. Long, J. Wang, et al., “On swarm intelligenceinspiredself-organized networking: Its bionin mechanisms, designing principlesand optimization approaches,”IEEE Commun. Surveys Tuts., vol. 16,no. 1, pp. 513–537, Feb. 2014.

[3] M. Bennis, S. M. Perlaza, P. Blasco, et al., “Self-organization in smallcell networks: A reinforcement learning approach,”IEEE Trans. WirelessCommun., vol. 12, no. 7, pp. 3202-3212, July 2013.

[4] K. Zhu, E. Hossain, and D. Niyato, “Pricing, spectrum sharing, andservice selection in two-tier small cell networks: A hierarchical dynamicgame approach”,IEEE Trans. on Mobile Compu., vol. 13, no. 8,pp. 1843–1856, Aug. 2014.

[5] P. Semasinghe, E. Hossain, and K. Zhu, “An evolutionary game fordistributed resource allocation in self-organizing smallcells,” IEEETransactions on Mobile Computing, vol.14, no. 2, pp. 274-287, Jan.2015.

[6] J. A. Hassan, M. Hassan, S. K. Das, et al, “Managing quality ofexperience for wireless VOIP using noncooperative games,”IEEE J.Sel. Areas Commun., vol. 30, no. 7, pp. 1193-1204, July 2012.

[7] R. Myerson, Game Theory: Analysis of Conflict. Cambridge, MA:Harvard Univ. Press, 1991.

[8] A. Imran and L. Giupponi, “Use of learning, game theory and optimiza-tion as biomimetic approaches for self-organization in heterogeneousnetworks”, book chapter inCognitive Signals and CommunicationTechnology, pp. 237-268, Springer, 2014

[9] D. Monderer. and L. S. Shapley, “Potential games,”Games and Eco-nomic Behavior, vol. 14, pp. 124–143, 1996.

[10] Y. Xu, J. Wang, Q. Wu, et al, “Opportunistic spectrum access in cognitiveradio networks: Global optimization using local interaction games,”IEEE J. Sel. Signal Process, vol. 6, no. 2, pp. 180–194, April 2012.

[11] Y. Xu, J. Wang, Q. Wu, et al, “Opportunistic spectrum access inunknown dynamic environment: A game-theoretic stochasticlearningsolution”, IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1380 –1391, April 2012.

[12] F. P. Kelly, “Charging and rate control for elastic traffic,” EUROPEANTransaction on Telecommunication, vol. 8, no. 1, pp. 33-37, Jan. 1997.

[13] M. Bennis, D. Niyato, “A Q-learning based approach to interferenceavoidance in self-organized femtocell networks,” inProc. of IEEEGLOBECOM Workshops, pp. 706-710, December 6-10, 2010.

[14] Y. Xu, Q. Wu, L. Shen, et al., “Robust multiuser sequential channelsensing and access in dynamic cognitive radio networks: Potentialgames and stochastic learning,”IEEE Trans. on Vehicular Technology,to appear.

[15] P. Marshall, Scalability, density, and decision making in cognitivewireless networks. Cambridge University Press, 2012.

A Game Theoretic Perspective on Self-organizing ... · Index Terms—5G, cognitive small cells,...

Documents

Transcript of A Game Theoretic Perspective on Self-organizing ... · Index Terms—5G, cognitive small cells,...