Efficient Online Portfolio with Logarithmic Regret05... · Efficient Online Portfolio with...
Transcript of Efficient Online Portfolio with Logarithmic Regret05... · Efficient Online Portfolio with...
Efficient Online Portfolio with Logarithmic Regret
Haipeng Luo (USC)Chen-Yu Wei (USC)Kai Zheng (Peking University)
Online Portfolio
Wealth𝑡
Online Portfolio
Wealth𝑡
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
Online Portfolio
Wealth𝑡
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
× 1.4
× 1.0
× 0.5
Online Portfolio
Wealth𝑡
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
0.7Wealth𝑡
0.3Wealth𝑡
0.1Wealth𝑡
× 1.4
× 1.0
× 0.5
Online Portfolio
Wealth𝑡+1= 0.7 + 0.3 + 0.1 Wealth𝑡= 1.1Wealth𝑡
Wealth𝑡
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
0.7Wealth𝑡
0.3Wealth𝑡
0.1Wealth𝑡
× 1.4
× 1.0
× 0.5
Online Portfolio
Wealth𝑡
𝑥𝑡 (decision)
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
0.7Wealth𝑡
0.3Wealth𝑡
0.1Wealth𝑡
× 1.4
× 1.0
× 0.5
Wealth𝑡+1= 0.7 + 0.3 + 0.1 Wealth𝑡= 1.1Wealth𝑡
Online Portfolio
Wealth𝑡+1= 0.7 + 0.3 + 0.1 Wealth𝑡= 1.1Wealth𝑡
Wealth𝑡
𝑥𝑡 (decision) 𝑟𝑡 (price relative)
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
0.7Wealth𝑡
0.3Wealth𝑡
0.1Wealth𝑡
× 1.4
× 1.0
× 0.5
Online Portfolio
Wealth𝑡+1= 0.7 + 0.3 + 0.1 Wealth𝑡= 1.1Wealth𝑡= 𝑥𝑡, 𝑟𝑡 Wealth𝑡
Wealth𝑡
𝑥𝑡 (decision) 𝑟𝑡 (price relative)
0.5Wealth𝑡
0.3Wealth𝑡
0.2Wealth𝑡
0.7Wealth𝑡
0.3Wealth𝑡
0.1Wealth𝑡
× 1.4
× 1.0
× 0.5
Online Portfolio
𝑊1
𝑇 periods
Online Portfolio
𝑊1 𝑊2= 𝑥1, 𝑟1 𝑊1
𝑇 periods
Online Portfolio
𝑊1 𝑊2= 𝑥1, 𝑟1 𝑊1
𝑊3= 𝑥2, 𝑟2 𝑊2
𝑇 periods
Online Portfolio
𝑊1 𝑊2= 𝑥1, 𝑟1 𝑊1
𝑊3= 𝑥2, 𝑟2 𝑊2
𝑇 periods
𝑊𝑇+1𝑊1=
𝑡=1
𝑇
𝑥𝑡 , 𝑟𝑡Final wealth
Initial wealth
Online Portfolio
Gain:
Online Portfolio
Gain:
Benchmark:
Online Portfolio
Gain:
Benchmark:
Minimize (Regret)
Online Portfolio
Gain:
Online Convex Optimization [Zinkevich’03]Benchmark:
Minimize (Regret)
Online Portfolio
Gain:
Online Convex Optimization [Zinkevich’03]Benchmark:
Minimize
Maximum Relative Ratio
𝛻ℓ𝑡 𝑥 ∞ ≾ 𝐺 ≜ max𝑖,𝑗
𝑟𝑡,𝑖𝑟𝑡,𝑗
(Regret)
But with possibly unbounded gradient
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇 𝑁: number of stocks
𝑇: number of rounds
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
ONS (Hazan et al. 2007) 𝐺𝑁 log 𝑇 𝑁3.5
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
ONS (Hazan et al. 2007) 𝐺𝑁 log 𝑇 𝑁3.5
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
ONS (Hazan et al. 2007) 𝐺𝑁 log 𝑇 𝑁3.5
Soft-Bayes (Orseau et al. 2017) 𝑇𝑁 𝑁
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
ONS (Hazan et al. 2007) 𝐺𝑁 log 𝑇 𝑁3.5
Soft-Bayes (Orseau et al. 2017) 𝑇𝑁 𝑁
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
ONS (Hazan et al. 2007) 𝐺𝑁 log 𝑇 𝑁3.5
Soft-Bayes (Orseau et al. 2017) 𝑇𝑁 𝑁
? ≈ 𝑁 log 𝑇 ≈ 𝑁
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Previous Results and Our Results
• Lower bound: Ω 𝑁 log 𝑇
Algorithm Regret Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)
𝑁 log 𝑇 𝑇14𝑁4
ONS (Hazan et al. 2007) 𝐺𝑁 log 𝑇 𝑁3.5
Soft-Bayes (Orseau et al. 2017) 𝑇𝑁 𝑁
? ≈ 𝑁 log 𝑇 ≈ 𝑁
BarrONS (this work) 𝑁2 log 𝑇 4 𝑇𝑁2.5
• Upper bounds:
𝑁: number of stocks
𝑇: number of rounds
𝐺: maximum relative ratio
Key Components of Our Algorithm
bad bad suddenly good
But player puts little weight on it
Main Challenge:
Key Components of Our Algorithm
Barrons (Barrier-Regularized-ONS) compared to ONS:
bad bad suddenly good
But player puts little weight on it
Main Challenge:
Key Components of Our Algorithm
Barrons (Barrier-Regularized-ONS) compared to ONS:
1. Additional regularizer (to avoid too extreme distribution over stocks)
bad bad suddenly good
But player puts little weight on it
Main Challenge:
Key Components of Our Algorithm
Barrons (Barrier-Regularized-ONS) compared to ONS:
1. Additional regularizer (to avoid too extreme distribution over stocks)
2. Increase the learning rate for worse stocks (faster recovery)
bad bad suddenly good
But player puts little weight on it
Main Challenge:
Key Components of Our Algorithm
Barrons (Barrier-Regularized-ONS) compared to ONS:
1. Additional regularizer (to avoid too extreme distribution over stocks)
2. Increase the learning rate for worse stocks (faster recovery)
3. Restarting (adapting to maximum relative ratio)
bad bad suddenly good
But player puts little weight on it
Main Challenge:
Key Components of Our Algorithm
Barrons (Barrier-Regularized-ONS) compared to ONS:
1. Additional regularizer (to avoid too extreme distribution over stocks)
2. Increase the learning rate for worse stocks (faster recovery)
3. Restarting (adapting to maximum relative ratio)
bad bad suddenly good
But player puts little weight on it
Main Challenge:
Poster #157