Outline - Hong Kong University of Science and Technology
Outline
Introduction and Motivation
Survey of Existing Approaches
Example: Distributive Delay-Optimal Control for Uplink OFDMA via Localized Stochastic Learning and Auction Game
Convergence Analysis
Asymptotic Optimality
Conclusion
Introduction and Motivation
Why is delay performance important?
“WHAT??!! He is stuck in the air!!! $*(&#%*!(!”
“You must be kidding me! Buffering at such an important moment!!??”
Introduction and Motivations
We may have multiple delay-sensitive wireless applications running on different devices:
– Keep track of a game
– Play a multi-player game
– Keep talking to some friends
Related Works
OFDMA Joint Power and Subband Design for PHY Performance
[Yu’02], [Hoo’04], [Seong’06], etc.
– Selects the strongest user per subband
– Time-frequency water-filling power allocation
– Assuming knowledge of perfect CSIT
[Lau’05], [Wong’09], [Brah’07], etc.
– Robust power and subband control with limited feedback or outdated CSIT (packet errors)
Introduction and Motivations
Challenges in incorporating queue state information (QSI) and channel state information (CSI) into the adaptation
When Shannon meets Kleinrock…
[Photos: Claude Shannon, Leonard Kleinrock]
Existing Approaches to Deal with Delay-Optimal Control
Various approaches dealing with delay problems
[Figure: buffer partitioning. The buffer state S is regulated towards 1/v; the two regions S < 1/v and S > 1/v are labeled v and −v.]
Related Works
Various approaches dealing with delay problems
Approach II [Yeh’01 PhD], [Yeh’03 ISIT]
– Symmetric and homogeneous users in multi-access fading channels
– Using stochastic majorization theory, the authors showed that the longest queue highest possible rate (LQHPR) policy is delay-optimal
[Figure: capacity region with points A and B; a longer queue for user 1 gives a higher rate for user 1]
Related Works
Various approaches dealing with delay problems
Technical Challenges To Be Solved
Uplink OFDMA System Model
OFDMA PHY Model
OFDMA Physical Layer Model
[Figure: uplink OFDMA system with Mobile 1 through Mobile K and the BS. Based on the CSI $H = \{H_{k,n}\}$, the OFDMA PHY performs subband (subcarrier) allocation and power control, which determines the data rate $R_k$. The figure also carries the condition $E[XX^H] = I$.]
Source Model and System States
[Figure: cross-layer system model. Packet arrivals (e.g. G-MAP packets, YouTube packets) enter the MAC layer buffers. The cross-layer controller (BS) observes the MAC state (QSI) and the PHY state (CSI) and decides the power and subband allocation at the PHY layer. The time axis is divided into scheduling time slots of PHY frames; the channel is quasi-static within a slot and i.i.d. between slots.]
OFDMA Queue Dynamics
The time domain is partitioned into scheduling slots
CSI $H(t)$ remains quasi-static within a slot and is i.i.d. between slots
Packet arrivals $A(t) = (A_1(t), \dots, A_K(t))$, where $A_k(t)$ is i.i.d. according to a general distribution $P(A)$
$N_k(t)$ denotes the random packet size, i.i.d.
$Q_k(t)$ denotes the number of packets waiting in the k-th buffer at the t-th slot
Global system state: (CSI, QSI)
[Equation annotation: total number of bits transmitted in the t-th slot]
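The queue evolution described on this slide can be sketched as a per-slot update. This is only an illustration under stated assumptions, since the queue-dynamics equation itself did not survive in the transcript: the function name `queue_update`, the finite buffer size `max_queue`, and the rule that a packet departs only once all of its bits have been sent are hypothetical choices, not taken from the slides.

```python
def queue_update(q, arrivals, bits_sent, packet_bits, max_queue):
    """One scheduling slot of a per-user packet queue (illustrative sketch).

    q           -- packets waiting at the start of the slot, Q_k(t)
    arrivals    -- packets arriving during the slot, A_k(t)
    bits_sent   -- total number of bits transmitted in the slot
    packet_bits -- packet size in bits, N_k(t) (assumed fixed within the slot)
    max_queue   -- assumed finite buffer size; overflow packets are dropped
    """
    # Only whole packets leave the buffer, and never more than are waiting.
    departures = min(q, bits_sent // packet_bits)
    backlog = q - departures + arrivals
    dropped = max(0, backlog - max_queue)
    return backlog - dropped, dropped


# 5 packets waiting, 2500 bits sent with 1000-bit packets, 3 new arrivals.
next_q, dropped = queue_update(q=5, arrivals=3, bits_sent=2500,
                               packet_bits=1000, max_queue=8)
```

Returning the dropped count alongside the new queue length makes it easy to track a packet drop rate, which the formulation later constrains.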
OFDMA Delay-Optimal Formulation
Stationary Power and Subband Allocation Control Policy
A mapping from the system state to power and subband allocation actions, subject to:
– Power constraint
– Subband allocation constraint
– Packet drop rate constraint
OFDMA Delay-Optimal Formulation
Definitions: Average Delay, Power and Packet Drop Constraints under a control policy
Little’s Law: average no. of packets = average arrival rate × average delay
[Equation annotations: the average delay (in terms of seconds); the average queue length]
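As a worked instance of Little's Law, the average delay follows from dividing the average queue length by the average arrival rate. The function name `average_delay` and the numbers below are made up for illustration and do not come from the slides.

```python
def average_delay(avg_queue_len, avg_arrival_rate):
    """Little's Law: average no. of packets = average arrival rate * average delay."""
    if avg_arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return avg_queue_len / avg_arrival_rate


# Hypothetical user: 12 packets queued on average, 40 packets/s arriving on average.
delay_seconds = average_delay(12.0, 40.0)  # 0.3 seconds
```

This is why minimizing the average queue length is equivalent to minimizing the average delay when the arrival rate is fixed.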
OFDMA Delay-Optimal Formulation
Problem Formulation: Find the optimal control policy that minimizes the weighted average delay
– “Positive weighting factor”
– Pareto optimal delay boundary
Why is the optimization problem difficult?
– Huge dimension of variables involved (policy = set of actions over all system state realizations)
– K queues are coupled together: exponentially large state space
– In general, we cannot have an explicit closed-form expression of how the objective function (average delay) is related to the control variables (policy)
– The problem is not convex
Solution: Markov Decision Problem (MDP)
Overview of Markov Decision Problem Formulation
Specification of an Infinite Horizon Markov Decision Problem
– Decisions are made at points of time: decision epochs
– System state and control action space:
  – At the t-th decision epoch, the system occupies a state $S_t$
  – The controller observes the current state and applies an action $A_t$
– Per-stage reward & transition probability:
  – By choosing an action, the system receives a reward $R(S_t, A_t)$
  – The system state at the next epoch is determined by a transition probability kernel
– Stationary control policy:
  – The set of actions for all system state realizations
– The optimization problem:
  – Average reward
  – Optimal policy
$$R^* = \max_{\pi} \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} R(S_t, A_t) \right], \qquad A_t = \pi(S_t)$$
Overview of Markov Decision Problem Formulation
Solution of a Markov Decision Problem
– Optimal average reward
– Optimal policy (fixed point problem on functional space)
Constrained Markov Decision Problem Formulation
Lagrangian approach to the Constrained MDP:
CMDP Formulation: Find the optimal control policy that minimizes
Optimal Solution
Infinite Horizon Average Reward MDP
Given a stationary control policy, the random process evolves like a Markov chain with a transition kernel determined by the policy
Solution is given by the “Bellman Equation”
[Equation annotations: “potential function” (contribution of the state i to the average reward); “optimal value”; equations and unknowns]
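For a small, fully known MDP, the Bellman equation for the average reward can be solved numerically by relative value iteration. The sketch below is a generic textbook procedure rather than the method of the slides; the function name, the toy transition table `P`, and the reward table `R` are hypothetical. It assumes the standard form theta + V(s) = max_a [R(s, a) + sum_j P(j | s, a) V(j)], where theta is the optimal average reward and V is the potential function, and it relies on the usual convergence conditions (for example a unichain, aperiodic model).

```python
def relative_value_iteration(P, R, ref_state=0, tol=1e-9, max_iter=10_000):
    """Solve theta + V(s) = max_a [ R[s][a] + sum_j P[s][a][j] * V(j) ].

    P[s][a][j] -- transition probability from state s to state j under action a
    R[s][a]    -- per-stage reward
    Returns (theta, V, policy) with V[ref_state] pinned to 0.
    """
    n = len(P)
    V = [0.0] * n
    for _ in range(max_iter):
        # One Bellman backup per (state, action) pair.
        backup = [
            [R[s][a] + sum(p * v for p, v in zip(P[s][a], V))
             for a in range(len(P[s]))]
            for s in range(n)
        ]
        TV = [max(row) for row in backup]
        theta = TV[ref_state]            # average-reward estimate
        new_V = [x - theta for x in TV]  # keep the potential relative
        converged = max(abs(a - b) for a, b in zip(new_V, V)) < tol
        V = new_V
        if converged:
            break
    policy = [max(range(len(row)), key=row.__getitem__) for row in backup]
    return theta, V, policy


# Toy 2-state, 2-action model (hypothetical numbers).
# Action 0 stays in the current state, action 1 moves to the other state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[1.0, 0.0],
     [2.0, 0.0]]
theta, V, policy = relative_value_iteration(P, R)  # theta = 2.0, policy = [1, 0]
```

The tables `P` and `R` have one entry per global state, so with K coupled queues their size grows exponentially in K.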
Optimal Solution – Online Learning
How to determine the potential function?
– Brute-force solution of the Bellman equation (value iteration)? Too complicated: exponential complexity and memory requirement
– Online stochastic learning! Iteratively estimate the potential function based on real-time observation of CSI and QSI (online value iteration)
Centralized solution?
– Obtain knowledge of the global QSI from K users (uplink)? Heavy signaling load to deliver the QSI from the mobiles to the BS
– Must have a distributive solution!
Distributive solution:
– Per-user potential and Lagrange multipliers (LMs) initialization
– Online policy improvement based on per-subband auction
– Online per-user potential and LMs update [local CSI, local QSI]
– Termination
Decentralized Solution (I)
Online Per-user Primal-Dual Potential Learning Algorithm via Stochastic Approximation
Remark (Comparison to the deterministic NUM)
– Deterministic NUM: iterative updates are performed within the CSI coherence time, which limits the number of iterations and the performance.
– Proposed online algorithm: iterative updates evolve on the same time scale as the CSI and QSI, so they converge to a better solution (no longer limited by the coherence time of the CSI).
Both the per-user potential and the 2 LMs are updated simultaneously.
New observation at the beginning of the (l+1)-th slot
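A simultaneous primal-dual update of this kind can be sketched as one stochastic-approximation step per slot. The update equations of the slide were not recovered, so everything below is an assumption for illustration: the function name `primal_dual_step`, the per-stage cost (queue length plus a priced power violation), the reference state used to keep the potential relative, and the use of a single LM for the power constraint. A second LM, for example for the packet drop constraint, would be updated in the same way.

```python
def primal_dual_step(V, lm_power, q, q_next, power_used, avg_power_limit,
                     step_v, step_lm, ref_state=0):
    """One simultaneous update of a per-user potential table and a power LM.

    V               -- dict: local queue length -> current potential estimate
    lm_power        -- current Lagrange multiplier of the average power constraint
    q, q_next       -- local queue length observed in this slot and in the next
    power_used      -- transmit power spent in this slot
    avg_power_limit -- assumed average power budget of this user
    step_v, step_lm -- step sizes supplied by the caller
    """
    # Per-stage Lagrangian cost: queue length plus the priced constraint violation.
    cost = q + lm_power * (power_used - avg_power_limit)
    # Sampled Bellman target; subtracting V[ref_state] keeps the potential relative.
    target = cost + V.get(q_next, 0.0) - V.get(ref_state, 0.0)
    new_V = dict(V)
    new_V[q] = V.get(q, 0.0) + step_v * (target - V.get(q, 0.0))
    # Dual step: raise the price when the budget is exceeded, projected onto >= 0.
    new_lm = max(0.0, lm_power + step_lm * (power_used - avg_power_limit))
    return new_V, new_lm


# Hypothetical slot: queue goes from 3 to 2 packets while using twice the power budget.
V1, lm1 = primal_dual_step({}, 0.5, q=3, q_next=2, power_used=2.0,
                           avg_power_limit=1.0, step_v=0.5, step_lm=0.1)
```

Only local observations (the user's own queue length and power) enter the update, which is what allows each user to run it without knowing the global QSI.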
Decentralized Solution (II)
Per-stage auction with K bidders (MSs) and one auctioneer (BS)
Low-complexity scalarized per-subband auction
– Bidding: each user submits a bid
– Subband allocation
– Power allocation
– Charging
Lemma: The per-stage social optimal scalarized bid is a function of (CSI, QSI); the water level depends on the QSI (via the potential function)
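The allocation steps of the auction can be sketched as follows. The bid, allocation, power and charging formulas of the slide were not recovered, so this sketch substitutes a generic highest-bid-wins rule per subband and textbook water-filling; the function names and the numbers are hypothetical, the water level is simply passed in (on the slide it depends on the QSI via the potential function), and the charging step is omitted.

```python
def allocate_subbands(bids):
    """bids[k][n] is user k's scalar bid for subband n; the highest bid wins."""
    num_users, num_subbands = len(bids), len(bids[0])
    return [max(range(num_users), key=lambda k: bids[k][n])
            for n in range(num_subbands)]


def water_filling_power(water_level, channel_gain):
    """Textbook water-filling: power = max(0, water_level - 1 / channel_gain)."""
    return max(0.0, water_level - 1.0 / channel_gain)


# Two users bidding on three subbands (made-up bids).
bids = [[0.9, 0.2, 0.4],
        [0.5, 0.7, 0.4]]
winners = allocate_subbands(bids)        # ties go to the lowest user index
power = water_filling_power(2.0, 4.0)    # a higher water level means more power
```

Because each user sends one scalar per subband, the BS needs no knowledge of the users' queue states to pick the winners.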
Decentralized Solution
Theorem (Convergence of online per-user learning) Under some mild conditions, the distributive learning converges almost surely.
Theorem (Asymptotically Global Optimal) For large K, the online per-user learning algorithm is asymptotically globally optimal, and the summation of the per-user potentials approaches (w.p.1) the solution of the centralized Bellman equation.
Remark (Comparison to conventional stochastic learning)
– Conventional SL: (1) for unconstrained MDP only, or the LMs for the CMDP are determined offline by simulation; (2) designed for a centralized solution with the control action determined entirely from the potential update. The convergence proof is based on the standard “contraction mapping” and fixed-point theorem argument.
– Proposed SL: (1) simultaneous update of the LMs and the potential function; (2) the control action is determined by all the users’ potentials via the per-stage auction. The per-user potential update is NOT a contraction mapping, and the standard proof does not apply.
Numerical Results
Average Delay per user vs SNR
– Close-to-optimal performance even for a small number of users
– Huge gain in delay performance compared with Modified-Largest Weighted Delay First (M-LWDF), which is the queue-length-weighted throughput maximization
Numerical Results
Average Delay per user vs No. of users
– The distributive solution has a huge gain in delay performance compared with the 3 baselines
Numerical Results
Illustration of convergence property: Potential function vs. the scheduling slot index (K=10)
Conclusion
Online Per-user Learning:
– Simultaneous update of LMs and potentials
– Almost sure convergence
– Asymptotically globally optimal for large K
Optimal Strategy for the Auction Game:
– Delay-optimal power control: multi-level water-filling (QSI determines the water level; CSI determines the instantaneous allocation)
– Delay-optimal subband allocation: user selection based on (QSI, CSI)
References
• V. K. N. Lau, Y. Chen, “Delay-Optimal Precoder Design for Multi-Stream MIMO System”, to appear in IEEE Transactions on Wireless Communications, May 2009.
• V. K. N. Lau, Y. Cui, “Delay Optimal Power and Subcarrier Allocation for OFDMA System via Stochastic Approximation”, submitted to IEEE Transactions on Wireless Communications, 2008.
• K. B. Huang, V. K. N. Lau, “Stability and Delay of Zero-Forcing SDMA with Limited Feedback”, submitted to IEEE Transactions on Information Theory, Feb. 2009.
• L. Z. Ruan, V. K. N. Lau, “Multi-level Water-Filling Power Control for Delay-Optimal SDMA Systems”, submitted to IEEE Transactions on Wireless Communications, 2008.