Post on 11-Apr-2018
TED AND KARYN HUME CENTER FOR
NATIONAL SECURITY AND TECHNOLOGY
An Anti-Jamming Strategy for Channel Access in Cognitive Radio Networks
Shabnam Sodagari and T. Charles Clancy
shabnams@vt.edu
tcc@vt.edu
http://www.hume.ictas.vt.edu
Cognitive Radio (CR)
Motivation
[Figure: cognitive radio network showing a Primary Base Station, a Primary User, a CR, and a CR Base Station]
Like any communication network, cognitive radio networks (CRNs) are vulnerable to jamming attacks, which prevent them from utilizing spectrum opportunities.
Problem Modeling
The jammer can also perform spectrum sensing
It jams j channels at random from the total channels released to the secondary
In each time slot t, the secondary user (SU) tries to avoid interference to the primary user (PU) by spectrum sensing
Problem Modeling
Random jamming strategy helps the jammer against a smart SU.
We assume the SU can select one channel in each time slot and does not want to select a channel under attack.
We present the best strategy for the SU to maximize its throughput over time.
We assume the jammer does not jam channels while the PU is active and only targets the SU's access.
This assumption applies when the PU can recognize and punish jammers, the attacker is distant from the primary, or the jammer has no incentive to disrupt PU access.
Solution
Formulate as a multi-armed bandit process
Given unknown rewards of multiple levers, the player (here the SU) tries to pull the most rewarding lever at each time slot, with the goal of maximizing its total reward over the time horizon
Levers are associated with the available sub-channels
Reward is proportional to the SNR of the SU over the chosen sub-channel
Formulation
Total of $C$ sub-channels; each sub-channel $c$ can be in one of $K$ states, corresponding to SNR levels: $p_c = \{p_c^1, p_c^2, \ldots, p_c^K\}$
$p_c^1 = 0$ denotes sub-channel $c$ is under attack
Set of time-variant transition probabilities for each state $k \in \{1, 2, \ldots, K\}$ is $q_c(k, j, t)$, with $\sum_{j \neq k} q_c(k, j, t) \leq 1$
Suppose sub-channel $c$ is in state $p_c^k$ and is selected next after $t \geq 1$ rounds; the reward is $r_c^k$
Goal of CR User
Find a sub-channel selection strategy for each time slot that maximizes the infinite time-horizon reward, where the reward is expressed in terms of the selected channel's SNR.
$r_N = \begin{cases} S_c & \text{if } c \in C \\ 0 & \text{else} \end{cases}$
Solution: Solve Whittle’s linear Program
$x^{c}_{p_c^k t}$ and $y^{c}_{p_c^k t}$ denote the probabilities, in the optimal policy, that sub-channel $c$ in state $p_c^k \in p_c$ is selected or not selected $t$ time steps after it was last accessed.
Formulation
Core of semi-uniform strategies:
Pulling the best lever greedily, except when a uniformly random action is taken.
For example, in the epsilon-first strategy there are distinct exploration and exploitation phases: a pure exploration phase occupies a fraction ε of the trials, and pure exploitation the remaining fraction 1 − ε
Epsilon-greedy strategy: the best lever is selected for a proportion 1 − ε of the trials, and a lever is selected uniformly at random for the remaining proportion ε
State machine of semi-uniform multi-armed bandit strategies
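The two semi-uniform selection rules can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names, the running-mean reward estimate, and the choice to explore over all levers (including the current best) are assumptions made for the example.

```python
import random

def epsilon_greedy_choice(estimates, epsilon, rng):
    """With probability epsilon explore uniformly; otherwise pull the best lever."""
    if rng.random() < epsilon:
        # Assumption: exploration draws from all levers, including the current best.
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda c: estimates[c])

def epsilon_first_choice(estimates, epsilon, t, horizon, rng):
    """Explore uniformly for the first epsilon * horizon rounds, then exploit."""
    if t < epsilon * horizon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda c: estimates[c])

def update_mean(estimates, counts, lever, reward):
    """Update the running mean reward (e.g. SNR) of the lever that was pulled."""
    counts[lever] += 1
    estimates[lever] += (reward - estimates[lever]) / counts[lever]
```

Here `estimates[c]` stands for the SU's current estimate of the reward on sub-channel `c`, and `t` counts completed rounds from zero.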
Epsilon-Decreasing and Adaptive Epsilon-Greedy Strategies
When the value of ε in an epsilon-greedy strategy decreases as the number of experiments increases, the strategy is called epsilon-decreasing
An alternative is the adaptive epsilon-greedy strategy, which adjusts ε through reinforcement learning by keeping track of the reward differences during experiments, i.e., large changes in the reward call for a higher ε, meaning more exploration relative to exploitation.
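As a rough illustration of the two ideas, the sketch below shows one possible decreasing schedule and one possible adaptive rule. Both are assumptions for the example: the slides do not specify the decay law or the adaptation rule, and the names `decreasing_epsilon` and `adapt_epsilon`, the threshold `tol`, and the step size are hypothetical.

```python
def decreasing_epsilon(t, eps0=1.0, decay=0.01, eps_min=0.01):
    """Epsilon shrinks as the number of completed trials t grows (assumed 1/t-style law)."""
    return max(eps_min, eps0 / (1.0 + decay * t))

def adapt_epsilon(epsilon, reward, estimate, step=0.05, tol=1.0,
                  eps_min=0.01, eps_max=0.5):
    """Simple threshold heuristic: a reward far from the current estimate
    raises epsilon (more exploration); otherwise epsilon is lowered."""
    if abs(reward - estimate) > tol:
        return min(eps_max, epsilon + step)
    return max(eps_min, epsilon - step)
```

For instance, an SU that expects about 12 dB on a sub-channel and suddenly observes 0 would raise ε under this rule, which matches the intuition that a large reward change should trigger more exploration.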
Numerical Results
Monte Carlo simulations in MATLAB
- For various numbers of idle channels provided by the PU
- Varying number of jammed sub-channels
Goal: to show our method leads to better overall SNR
- Help the SU avoid jammed sub-channels
- 0 SNR on jammed channels
- 5 to 20 dB SNR on non-jammed channels
- To find the best exploration vs. exploitation phase length
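A small Monte Carlo experiment in the spirit of this setup can be sketched in Python (the original simulations were run in MATLAB; this is not that code). Several details are assumptions for illustration only: the jammed set is drawn at random once and held fixed during a run, non-jammed SNR is drawn uniformly from 5 to 20 dB in every slot, and the defaults (12 sub-channels, 4 jammed, ε = 0.05) are arbitrary.

```python
import random

def run_once(num_channels, num_jammed, rounds, epsilon, rng):
    """Return (epsilon-greedy average SNR, random-scheme average SNR) for one run."""
    # Assumption: the jammed set is drawn at random once and held for the run.
    jammed = set(rng.sample(range(num_channels), num_jammed))

    def snr(channel):
        # 0 SNR on jammed channels; uniform 5 to 20 dB otherwise (assumed distribution).
        return 0.0 if channel in jammed else rng.uniform(5.0, 20.0)

    estimates = [0.0] * num_channels
    counts = [0] * num_channels
    greedy_total = 0.0
    random_total = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.randrange(num_channels)  # explore
        else:
            choice = max(range(num_channels), key=estimates.__getitem__)  # exploit
        reward = snr(choice)
        counts[choice] += 1
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
        greedy_total += reward
        # Baseline: the random scheme picks a sub-channel uniformly each slot.
        random_total += snr(rng.randrange(num_channels))
    return greedy_total / rounds, random_total / rounds

def monte_carlo(runs=200, num_channels=12, num_jammed=4, rounds=1000,
                epsilon=0.05, seed=1):
    """Average both schemes over independent runs with a seeded generator."""
    rng = random.Random(seed)
    results = [run_once(num_channels, num_jammed, rounds, epsilon, rng)
               for _ in range(runs)]
    greedy_avg = sum(g for g, _ in results) / runs
    random_avg = sum(r for _, r in results) / runs
    return greedy_avg, random_avg

if __name__ == "__main__":
    greedy_avg, random_avg = monte_carlo()
    print(f"epsilon-greedy: {greedy_avg:.2f} dB, random: {random_avg:.2f} dB")
```

Sweeping `num_channels`, `num_jammed`, and `epsilon` in `monte_carlo` gives the kind of comparison the following plots report, although the numbers from this toy model need not match the reported ones.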
Comparison of average SNR obtained using ε-greedy method and the random scheme
More than 5 dB SNR improvement
Effect of exploration phase length on average SNR obtained using ε-greedy method
Takeaway: An exploration phase of up to 5% of rounds is enough
Comparison of average SNR obtained using ε-first, random and ε-greedy methods
Takeaway: With 1000 instances of completed plays and 33% jammed sub-channels, ε-first and ε-greedy approach the same performance as the number of sub-channels grows
Effect of exploration phase length on average SNR obtained using ε-first method
Takeaway: exploration phase longer than 5% of rounds degrades overall results
Average SNR over selected channels in ε-greedy method vs. varying number of sub-channels and jammed sub-channels
Takeaway: Our method always selects sub-channels with more than 5 dB improvement in SNR
Conclusion
Presented an anti-jamming strategy for secondary users for DSA
Application to CR: against jammers' manipulation of the spectrum sensing phase
Thank You