Learning Optimal Sniffer Channel Assignment for Small Cell Cognitive Radio Networks
Lixing Chen¹, Zhuo Lu², Pan Zhou³, Jie Xu¹
¹University of Miami, FL, USA  ²University of South Florida, FL, USA  ³Huazhong University of Science and Technology, Wuhan, China
Introduction
• Capture network traffic to analyze network conditions and performance
• Supports network operations: resource management, network configuration, fault diagnosis, network intrusion detection
Challenges
• Imperfect monitoring, unreliability of mmWave propagation
[Figure: sniffers switching among channels CH1, CH2, CH3 across multiple cells]
• Multi-cell scenario with assignment constraints
• Time-varying spectrum resources
Online Sniffer Channel Assignment using Bandit Learning
• Learn the characteristics of imperfect monitoring online
• Contextual combinatorial multi-armed bandit
Sniffer Channel Assignment
• Assign a set of sniffers S = {1, 2, …, S} to a set of channels C
• Assignment decision a = (a_s)_{s∈S}, a_s ∈ C_s ⊆ C
• Importance of channel c: w_c
  - Amount of traffic on the channel
  - Time occupied by licensed users and unlicensed users
• Packet capture probability P_c(a; θ)
  - Assignment decision a determines the number of sniffers assigned to channel c
  - θ = (θ_{s,c})_{s∈S, c∈C} denotes the performance of sniffer s on channel c
  - θ_{s,c} = Pr{SINR_{s,c} ≥ γ}: probability that sniffer s captures a broadcast packet on channel c (non-outage probability)
• Redundant Sniffer Assignment
  - Multiple sniffers may monitor the same channel; a sniffer may stay idle (a_s = ∅), and P_c(a; θ) = 0 if no sniffer is assigned to channel c
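As a concrete reading of the capture probability, a minimal sketch follows; it assumes (our assumption, not stated on the slides) that sniffers on the same channel fail independently, so redundant assignment multiplicatively shrinks the miss probability:

```python
def capture_prob(assigned_thetas):
    # Channel-level capture probability when several sniffers monitor the
    # same channel: the packet is missed only if every sniffer fails.
    # Independence across sniffers is an assumed model.
    miss = 1.0
    for theta in assigned_thetas:
        miss *= (1.0 - theta)
    return 1.0 - miss

def utility(assignment, weights, theta):
    # Weighted expected capture U(a; theta) = sum_c w_c * P_c(a; theta).
    # assignment: sniffer -> channel (None = idle)
    # weights:    channel -> importance w_c
    # theta:      (sniffer, channel) -> non-outage probability theta_{s,c}
    total = 0.0
    for c, w_c in weights.items():
        thetas = [theta[(s, c)] for s, ch in assignment.items() if ch == c]
        total += w_c * capture_prob(thetas)
    return total
```

Under this model, two redundant sniffers with θ = 0.5 each give a channel capture probability of 1 − 0.5·0.5 = 0.75, better than either alone.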
Sniffer Channel Assignment with Oracle Information
• Solve in each time slot t:
  max_a Σ_{c∈C} w_c · P_c(a; θ), s.t. a_s ∈ C_s ∪ {∅}
• Result: a greedy assignment a^greedy achieves a 1/2-approximation:
  U(a^greedy; θ) ≥ (1/2) · U(a^opt; θ)
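The greedy oracle can be sketched as below (a sketch under our assumed capture model 1 − Π(1 − θ); function names are ours). Greedy works here because the utility is monotone submodular and "one channel per sniffer" is a partition-matroid constraint, for which greedy guarantees half the oracle optimum:

```python
def greedy_assignment(sniffers, allowed, weights, theta):
    # Repeatedly add the (sniffer, channel) pair with the largest marginal
    # utility gain under U(a; theta) = sum_c w_c * (1 - prod(1 - theta_{s,c})).
    miss = {c: 1.0 for c in weights}          # current miss probability per channel
    assignment = {s: None for s in sniffers}
    unassigned = set(sniffers)
    while unassigned:
        best, best_gain = None, 0.0
        for s in unassigned:
            for c in allowed[s]:
                gain = weights[c] * miss[c] * theta[(s, c)]  # marginal gain
                if gain > best_gain:
                    best_gain, best = gain, (s, c)
        if best is None:
            break                              # no positive gain: stay idle
        s, c = best
        assignment[s] = c
        miss[c] *= (1.0 - theta[(s, c)])
        unassigned.remove(s)
    return assignment
```

Note that the marginal gain of adding sniffer s to channel c is exactly w_c times the current miss probability times θ_{s,c}, so the greedy choice can be evaluated in closed form per pair.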
• Algorithm: Online Sniffer Channel Assignment (OSA)
  - Solve a long-term problem over the time horizon t = 1, 2, …, T:
    max Σ_{t=1}^{T} Σ_{c∈C} w_c · P_c(a_t; θ), s.t. a_{s,t} ∈ C_s ∪ {∅}, ∀s, t
• Learning while optimizing
  - Task 1: Learn the unknown sniffer performance θ
  - Task 2: Maximize the utility based on the learned knowledge
  - Trade-off between the two tasks, balanced by the multi-armed bandit framework
• Arm/Action: sniffer-channel pairs
• Objective: learn general rewards while maximizing the cumulative reward
  - Exploitation: pull the arm that yielded the highest reward according to past experience
  - Exploration: pull less-tried arms to learn their rewards
[Figure: pulling an arm among arms 1, 2, 3, …]
• Context of sniffers is observable (denoted by φ_{s,c} ∈ Φ)
• Context-parameterized non-outage probability
  - Observed capture outcome r_{s,c} is drawn from a distribution parameterized by the context φ_{s,c}
  - θ_{s,c}(φ) = E[r_{s,c}(φ)]
  - Estimate the expected value → estimated non-outage probability θ̂_{s,c}
• Context partitioning
  - Continuous context space Φ = [0,1] → discrete context intervals
  - Similarity assumption: similar contexts → similar non-outage probabilities
[Figure: partition of the context space [0, 1] for sniffer s and SBS n]
  - Counter C_{s,c}(φ) records the amount of collected data
  - E_{s,c}(φ) stores the observed non-outage probabilities
  - Estimated non-outage probability: θ̂_{s,c} = (1 / C_{s,c}(φ)) · Σ_{r ∈ E_{s,c}(φ)} r (sample mean)
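The counter-and-estimate bookkeeping above might look like the following sketch (uniform partition of [0, 1] into h intervals is an assumption; class and attribute names are ours):

```python
class ContextEstimator:
    # Sample-mean estimator of the non-outage probability per
    # (sniffer, channel, context-interval) triple.
    def __init__(self, h):
        self.h = h
        self.count = {}   # (s, c, interval) -> number of observations C_{s,c}
        self.total = {}   # (s, c, interval) -> sum of observed outcomes

    def interval(self, phi):
        # Map a context phi in [0, 1] to one of h equal intervals.
        return min(int(phi * self.h), self.h - 1)

    def update(self, s, c, phi, outcome):
        key = (s, c, self.interval(phi))
        self.count[key] = self.count.get(key, 0) + 1
        self.total[key] = self.total.get(key, 0.0) + outcome

    def estimate(self, s, c, phi):
        key = (s, c, self.interval(phi))
        n = self.count.get(key, 0)
        return self.total.get(key, 0.0) / n if n else 0.0
```

By the similarity assumption, all observations whose contexts fall in the same interval are pooled into one estimate.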
[Figure: example of normalized SINR observations (0.82, 0.26, 0.58, 0.27, 0.51) for SBS 1 and SBS 2 mapped to context intervals on [0, 1]; the counter of the matching interval is incremented (+1)]
• Context observation
  - Each sniffer s senses SINRs φ_{s,c} on its accessible channels c ∈ C_s
  - Identify the context interval containing each φ_{s,c}
  - Determine whether the estimation is accurate: a sniffer-channel pair is under-explored if its counter satisfies C_{s,c}(φ_{s,c}) < K(t)
• Exploration
  - If the set of under-explored sniffers S^{ue,t} ≠ ∅, randomly assign each sniffer s ∈ S^{ue,t} to one of its under-explored channels
• Exploitation
  - Assign the remaining (explored) sniffers by solving
    max Σ_{c∈C} w_c · P_c(a; θ̂), s.t. a_s ∈ C_s ∪ {∅}, ∀s ∈ S^{ed,t}
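One time slot of this explore/exploit rule could be sketched as follows. This is a simplification under our assumptions: counts and estimates are passed per pair for the current context interval, exploitation reuses a greedy assignment on the estimated probabilities, and all names are ours:

```python
import math
import random

def osa_step(t, sniffers, allowed, weights, counts, estimates, alpha=1.0):
    # counts:    (sniffer, channel) -> observations in the current context interval
    # estimates: (sniffer, channel) -> estimated non-outage probability
    # Returns sniffer -> channel (None = idle).
    K_t = (t ** (2 * alpha / (3 * alpha + 1))) * math.log(t + 1)
    assignment = {s: None for s in sniffers}
    explored = set()
    for s in sniffers:
        under = [c for c in allowed[s] if counts.get((s, c), 0) < K_t]
        if under:
            assignment[s] = random.choice(under)   # exploration
        else:
            explored.add(s)
    # Exploitation: greedy assignment of explored sniffers on the estimates.
    miss = {c: 1.0 for c in weights}
    while explored:
        best, best_gain = None, 0.0
        for s in explored:
            for c in allowed[s]:
                gain = weights[c] * miss[c] * estimates.get((s, c), 0.0)
                if gain > best_gain:
                    best_gain, best = gain, (s, c)
        if best is None:
            break
        s, c = best
        assignment[s] = c
        miss[c] *= (1.0 - estimates[(s, c)])
        explored.remove(s)
    return assignment
```

The counter test against K(t) is what the "determine whether the estimation is accurate" check corresponds to: pairs with few observations are forced into exploration, all others are exploited greedily.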
Online Learning via Multi-armed Bandit
• Regret: R(T) = Σ_{t=1}^{T} U(a_t^opt; θ) − Σ_{t=1}^{T} U(a_t; θ)
• Regret bound: with K(t) = t^{2α/(3α+1)} · log(t) and h_T = ⌈T^{1/(3α+1)}⌉, the upper bound of E[R(T)] is O(T^{(2α+1)/(3α+1)} · log(T)), i.e., the regret is sublinear in T
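The exponent in this bound can be checked numerically: for any α > 0 it is strictly below 1, which is what makes the regret sublinear (a small sketch; the function name is ours):

```python
def regret_exponent(alpha):
    # Exponent of T in the regret bound O(T^{(2a+1)/(3a+1)} * log T).
    # Strictly less than 1 for any alpha > 0, hence sublinear regret.
    return (2 * alpha + 1) / (3 * alpha + 1)
```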
Simulation Setup
• 25 sniffers (red squares) in a grid layout
• Randomly deployed users (yellow dots)
• Background color indicates user density
• Factors affecting SINR: interference (number of nearby users)
Baselines
• UCB: a classic MAB algorithm, non-contextual and non-combinatorial
• A contextual bandit baseline that assumes the reward is a linear function of the context
• Random: takes random assignment decisions
[Figure: cumulative rewards]
Fig. Comparison of regrets
Rewards and Regret
• OSA with Non-Redundant Assignments (OSA-NRA): redundant sniffer assignment is beneficial
Fig. Comparison of regret