On-Line Portfolio Selection Using Multiplicative Updates Written by David P. Helmbold (Cal), Robert...

On-Line Portfolio Selection Using Multiplicative Updates

Written by David P. Helmbold (Cal), Robert E. Schapire (Cal), Yoram Singer (AT&T) and Manfred K. Warmuth (Cal)

Presented by Ryan M. McCabe

Goal

Within a menu of a fixed number of stocks, we want to make as much money as possible without relying too much on luck

We’ll compare our results to how well the best single stock, another form of on-line learning (Cover) and a batch learner (BCRP) each performed

Context

Remember, this is on-line learning

Unlike batch learning, the data is coming to us in a stream, and we learn from each example

Still, we do not want to completely ignore what we have learned from history

More Context

We have a bunch of stocks

We have some wealth

Every day we get a report on the stocks

Every day we update our current wealth, based on their performance yesterday

Every day we re-allocate our wealth over the stocks

Preliminaries

We have $N$ stocks

$w$ is a vector of weights over the $N$ stocks
  the $w_i$, for $i = 1$ to $N$, sum to 1
  every $w_i \ge 0$

There are $T$ trading days in total; a superscript $t$ denotes a specific day

Preliminaries

$w^t$ is the vector of weights at time $t$
  $w^t$ is chosen at the beginning of day $t$

$x^t$ is the vector of relative performance of all the stocks over the course of day $t$
  $x_i^t$ = closing price of stock $i$ on day $t$ / opening price of stock $i$ on day $t$

Over day $t$ our wealth is multiplied by the factor $w^t \cdot x^t$

We change $w^t$ every day in some way
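As a minimal sketch of the bookkeeping defined on this slide, the following Python computes the daily factor $w^t \cdot x^t$ and the cumulative wealth over several days. The function names and the two-day price relatives are made up for illustration; they are not from the paper.

```python
def daily_factor(w, x):
    """Return w . x, the factor by which wealth is multiplied over one day."""
    return sum(wi * xi for wi, xi in zip(w, x))


def cumulative_wealth(weights, relatives, initial=1.0):
    """Multiply the initial wealth by each day's factor w^t . x^t."""
    wealth = initial
    for w, x in zip(weights, relatives):
        wealth *= daily_factor(w, x)
    return wealth


# Hypothetical two-stock, two-day example (illustrative numbers only).
relatives = [[1.10, 0.95], [0.90, 1.05]]   # x^1, x^2
weights = [[0.5, 0.5], [0.5, 0.5]]         # w^1, w^2
final_wealth = cumulative_wealth(weights, relatives)
print(final_wealth)
```

Because wealth compounds multiplicatively, the quantity of interest is the product of the daily factors rather than their sum.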

Follow-Ups

If we have time at the end of this presentation, we’ll talk about some things of practical importance
  Transaction costs
  Side information
  Implementation details

Four Types of Portfolio Managers

(Best) Constant-Rebalanced Portfolio

Cover Universal Portfolio

Exact Exponentiated Gradient (ExactEG($\eta$))

Approximate Exponentiated Gradient (EG($\eta$))

Constant-Rebalanced Portfolios

A CRP uses the same weight vector every day; the best one is found in hindsight by looking back over all $T$ days of data (this is our batch method)

Although the wealth is redistributed every day over the $N$ stocks, $w^t$ stays the same from 1…T

$w^*$ denotes the weight vector that maximizes wealth over the given set of $x^t$ from 1…T

$w^*$ defines the Best Constant-Rebalanced Portfolio (BCRP)
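To make the hindsight search for $w^*$ concrete, here is a minimal sketch for the two-stock case. It uses a brute-force grid search, which is an assumption made for illustration and not necessarily how the authors computed the BCRP. The data is a standard toy example (one flat stock, one that alternately doubles and halves), not the NYSE data.

```python
def crp_wealth(w, relatives):
    """Wealth of a constant-rebalanced portfolio that uses w every day."""
    wealth = 1.0
    for x in relatives:
        wealth *= sum(wi * xi for wi, xi in zip(w, x))
    return wealth


def best_crp_two_stocks(relatives, steps=1000):
    """Grid-search the fraction held in stock 0 (illustrative, 2 stocks only)."""
    best_w, best_wealth = None, float("-inf")
    for k in range(steps + 1):
        a = k / steps
        w = (a, 1.0 - a)
        wealth = crp_wealth(w, relatives)
        if wealth > best_wealth:
            best_w, best_wealth = w, wealth
    return best_w, best_wealth


# Toy data: stock 0 never moves, stock 1 alternately doubles and halves.
relatives = [(1.0, 2.0), (1.0, 0.5)] * 5
w_star, wealth_star = best_crp_two_stocks(relatives)
print(w_star, wealth_star)
```

In this toy example neither stock gains anything on its own, yet rebalancing to an even split every day grows wealth by a factor of 1.125 every two days. That is why the BCRP is a stronger benchmark than the best single stock.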

Cover Universal Portfolio

Another on-line method

$w^t$ is updated every day

$w^t$ is a weighted average over all feasible portfolios

Guarantees the same asymptotic growth rate as BCRP for any given set of $x^t$

Exponential complexity in $N$
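The "weighted average over all feasible portfolios" can be sketched for two stocks as follows. This assumes Cover's usual definition, in which each constant-rebalanced portfolio is weighted by the wealth it has earned so far, and it replaces the integral over the simplex with a uniform grid. Both the grid and the function name are illustrative assumptions.

```python
def universal_portfolio_two_stocks(relatives, steps=200):
    """Discretized universal portfolio for 2 stocks (illustrative sketch).

    grid[k] is the fraction a CRP keeps in stock 0; crp_wealths[k] is the
    wealth that CRP has earned so far and serves as its averaging weight.
    """
    grid = [k / steps for k in range(steps + 1)]
    crp_wealths = [1.0] * len(grid)
    wealth = 1.0
    history = []
    for x in relatives:
        # Today's portfolio: wealth-weighted average of all grid portfolios.
        total = sum(crp_wealths)
        a = sum(g * s for g, s in zip(grid, crp_wealths)) / total
        history.append((a, 1.0 - a))
        wealth *= a * x[0] + (1.0 - a) * x[1]
        # Update every CRP's wealth with today's price relatives.
        crp_wealths = [s * (g * x[0] + (1.0 - g) * x[1])
                       for g, s in zip(grid, crp_wealths)]
    return wealth, history


# Toy data: stock 0 never moves, stock 1 alternately doubles and halves.
relatives = [(1.0, 2.0), (1.0, 0.5)] * 5
up_wealth, history = universal_portfolio_two_stocks(relatives)
print(up_wealth, history[0])
```

The grid has `steps + 1` points for two stocks; covering $N$ stocks at the same resolution needs on the order of `steps ** (N - 1)` points, which is one way to see the exponential cost in $N$.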

Exact Exponentiated Gradient

Remember on-line regression?

$F(w^{t+1}) = \eta \log(w^{t+1} \cdot x^t) - d(w^{t+1}, w^t)$

Maximize $F(w^{t+1})$ over $w^{t+1}$, given $w^t$ and $x^t$

$\log(w^{t+1} \cdot x^t)$: maximizes wealth if $x^t$ stays still, that is, if tomorrow's price relatives repeat today's

$d(w^{t+1}, w^t)$: penalizes moving too far from $w^t$

$\eta$: the learning rate, shifts importance between the two main terms

But $F(w^{t+1})$ is difficult to maximize

How do we learn $w^t$?

So we use an approximation

Using a first-order Taylor approximation of the first term at $w^{t+1} = w^t$ and a relative entropy distance measure for the second penalty term, waving some hands, we get the EG($\eta$) update:
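The formula itself did not survive in this transcript. As a hedged reconstruction from the definitions on the previous slides (with $d$ taken to be relative entropy, as stated above), the hand-waving can be written out roughly as follows: linearize the log term around $w^t$, substitute relative entropy for $d$, and maximize subject to the weights summing to 1.

```latex
\begin{align*}
\log(w^{t+1} \cdot x^t)
  &\approx \log(w^t \cdot x^t)
   + \frac{x^t \cdot (w^{t+1} - w^t)}{w^t \cdot x^t}
  && \text{(first-order Taylor at } w^{t+1} = w^t\text{)} \\
d(w^{t+1}, w^t)
  &= \sum_{i=1}^{N} w_i^{t+1} \log \frac{w_i^{t+1}}{w_i^t}
  && \text{(relative entropy)} \\
0 &= \frac{\eta\, x_i^t}{w^t \cdot x^t}
   - \log \frac{w_i^{t+1}}{w_i^t} - 1 + \lambda
  && \text{(stationarity, multiplier } \lambda \text{ for } \textstyle\sum_i w_i^{t+1} = 1\text{)} \\
w_i^{t+1}
  &= \frac{w_i^t \exp\!\left(\eta\, x_i^t / (w^t \cdot x^t)\right)}
          {\sum_{j=1}^{N} w_j^t \exp\!\left(\eta\, x_j^t / (w^t \cdot x^t)\right)}
  && \text{(solve and normalize)}
\end{align*}
```

The last line is the multiplicative update of the title: each weight is multiplied by an exponential factor and the result is renormalized, so the weights stay non-negative and sum to 1.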

Exponentiated Gradient Update

This approximate version performs almost indistinguishably from the original ExactEG($\eta$), which maximizes $F(w^{t+1}) = \eta \log(w^{t+1} \cdot x^t) - d(w^{t+1}, w^t)$

Its running time is only linear in $N$
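A minimal Python sketch of the EG($\eta$) update shows why the cost per day is linear in $N$: one dot product and one pass over the weights. The uniform starting vector and the default `eta=0.05` are assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import exp


def eg_update(w, x, eta):
    """One EG(eta) step: w_i <- w_i * exp(eta * x_i / (w . x)), renormalized."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    unnormalized = [wi * exp(eta * xi / dot) for wi, xi in zip(w, x)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


def run_eg(relatives, eta=0.05):
    """Run EG(eta) over a sequence of price relatives, starting from uniform."""
    n = len(relatives[0])
    w = [1.0 / n] * n          # assumed starting point w^1
    wealth = 1.0
    for x in relatives:
        wealth *= sum(wi * xi for wi, xi in zip(w, x))  # earn w^t . x^t
        w = eg_update(w, x, eta)                        # then choose w^{t+1}
    return wealth, w


# One step on made-up data: weight shifts toward the stock that did better.
w_next = eg_update([0.5, 0.5], [1.2, 0.8], eta=0.05)
print(w_next)
```

With `eta = 0` the update leaves the weights unchanged, so the method reduces to a constant-rebalanced portfolio; larger values of `eta` chase recent performance more aggressively.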

Quick Recap

So now we have defined our four methods
  Best Constant-Rebalanced Portfolio (BCRP)
  Cover Universal On-Line Portfolio
  ExactEG($\eta$)
  Common (approximate) EG($\eta$)

Let’s see how they perform under pressure…

The Experiments

22 years of NYSE data (T > 5,000)

36 equities (N = {2, 3,…,36})
  Usually 2- or 3-stock subsets were used

Reproduced each Cover experiment
  Stocks chosen for volatility reasons

Found BCRP, then ran $w^*$ through from the beginning

Ran EG($\eta$), ExactEG($\eta$) through from the beginning

Commercial Metals and Kin Ark (Figure 5.1)

IBM and Coca Cola (Figure 5.2)

Gulf, HP, and Schlum (Figure 5.3)

Volatility Elasticity (Table 5.5)

Results Analysis Summary

EG($\eta$) and ExactEG($\eta$) were always within about 1% of each other, with EG($\eta$) running much faster

BCRP always did the best

EG($\eta$) always outperformed Cover’s Universal Portfolio, despite Cover’s superior analytical worst-case bound

Talking Points

“[S]urprisingly, the wealth achieved by the EG($\eta$) update was larger than the wealth achieved by the universal portfolio algorithm. This outcome is contrary to the superior worst-case bounds proved for the universal portfolio algorithm.”
  Cover = $O((N \log T)/T)$
  EG($\eta$) = $O(\sqrt{(\log N)/T})$
  Any ideas why?

Talking Points

So, the size of $N$ affected relative running times, but how did stock volatility affect relative overall wealth?

Would running time matter in this domain if the algorithms were applied? Why did it matter so much to the authors?

Follow Up

Transaction Costs
  Scottrade.com charges $7 per transaction
  Would you update every stock every day?

Side Information
  A finite number K of side-information states, available to the algorithm
  Computationally the same as K parallel versions running, so no big deal and may increase wealth

Implementation Details
  How do we pick $\eta$?
  How do we pick $w^1$?
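The side-information idea on this slide can be sketched as K independent copies of the update, one per state, where only the copy matching the current day's state is used and updated. This is an illustrative reading of "K parallel versions", using the EG($\eta$) update; the state sequence, the function names, the uniform start and `eta=0.05` are all assumptions.

```python
from math import exp


def eg_update(w, x, eta):
    """One EG(eta) step: multiplicative update followed by normalization."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    unnormalized = [wi * exp(eta * xi / dot) for wi, xi in zip(w, x)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


def run_eg_with_side_info(relatives, states, num_states, eta=0.05):
    """Keep one weight vector per side-information state (hypothetical sketch).

    states[t] is an integer in range(num_states) observed before day t.
    """
    n = len(relatives[0])
    portfolios = [[1.0 / n] * n for _ in range(num_states)]
    wealth = 1.0
    for x, k in zip(relatives, states):
        w = portfolios[k]                       # portfolio for today's state
        wealth *= sum(wi * xi for wi, xi in zip(w, x))
        portfolios[k] = eg_update(w, x, eta)    # only that copy learns
    return wealth, portfolios


# Made-up data: state 0 on days when stock 1 doubles, state 1 when it halves.
relatives = [(1.0, 2.0), (1.0, 0.5)] * 5
states = [0, 1] * 5
wealth, portfolios = run_eg_with_side_info(relatives, states, num_states=2)
print(wealth, portfolios)
```

Each day touches exactly one of the K weight vectors, so the per-day cost matches a single run; the extra cost is the memory for K vectors.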

Done
