A PowerPoint Presentation Online Principal Component Analysis · 2019. 11. 26. · A PowerPoint...

Online Principal Component Analysis
Boutsidis, Garber, Karnin, Liberty

PRESENTED BY Zohar Karnin, November 23, 2014


Data Matrix

Yahoo Labs

§  Often, data is represented as a huge matrix

§  Sometimes, we can’t store the entire matrix


Principal Component Analysis


§  Often, we require a low-rank approximation of matrix A
   ›  Recommender systems, images, LSA, …

§  The approximation is used to save space and, often, to clean up noise

[Figure: A shown as a sum of four component matrices]


Column by Column Stream


§  Data arrives column by column

§  Column = item, and we see the items one at a time


The Formal Stream Setup


§  Observe $x_1 \in \mathbb{R}^d$, output $y_1 \in \mathbb{R}^k$

§  …

§  Observe $x_t \in \mathbb{R}^d$, output $y_t \in \mathbb{R}^k$


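§  Cost = $\min_{\Phi} \sum_t \|x_t - \Phi y_t\|^2$

§  s.t. $\Phi$ = embedding from $\mathbb{R}^k$ to $\mathbb{R}^d$, i.e. $\|y_i - y_j\| = \|\Phi y_i - \Phi y_j\|$

[Figure: input matrix X and output matrix Y]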
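As a rough illustration of this cost: if $\Phi$ is restricted to matrices with orthonormal columns (one way to get a linear, distance-preserving map; this restriction is an assumption, not something the slide states), then for fixed X and Y the minimizing $\Phi$ is the orthogonal Procrustes solution. The helper name `best_embedding_cost` is hypothetical.

```python
import numpy as np

def best_embedding_cost(X, Y):
    """Return (Phi, cost) minimizing ||X - Phi Y||_F^2 over d x k matrices
    Phi with orthonormal columns (orthogonal Procrustes)."""
    X = np.asarray(X, dtype=float)   # d x n, one observed column per time step
    Y = np.asarray(Y, dtype=float)   # k x n, one output column per time step
    # With Phi^T Phi = I the cost equals ||X||^2 + ||Y||^2 - 2 trace(Phi^T X Y^T),
    # which is minimized by Phi = U V^T from the thin SVD of X Y^T.
    U, _, Vt = np.linalg.svd(X @ Y.T, full_matrices=False)
    Phi = U @ Vt
    cost = np.linalg.norm(X - Phi @ Y, "fro") ** 2
    return Phi, cost
```

If X lies exactly in a k-dimensional subspace and Y holds its coordinates in an orthonormal basis of that subspace, the returned cost is zero up to rounding.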


The Cost Function

§  X = input matrix, Y = output matrix

§  $\Phi Y$ = embedding of Y into the same space as X

§  Error matrix: $R = X - \Phi Y$

§  Frob Error = $\|R\|_F^2 = \sum_{ij} (X_{ij} - (\Phi Y)_{ij})^2$ = MSE (up to normalization)

§  Spectral Error = $\|R\|_2 = \max_{\|v\|=1} \|v^\top X - v^\top (\Phi Y)\|$
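A minimal sketch of the two error measures for a given X, $\Phi$, and Y, using NumPy's matrix norms. The function name `error_norms` is hypothetical.

```python
import numpy as np

def error_norms(X, Phi, Y):
    """Return (squared Frobenius error, spectral error) of R = X - Phi Y."""
    R = np.asarray(X, dtype=float) - np.asarray(Phi, dtype=float) @ np.asarray(Y, dtype=float)
    frob_sq = np.linalg.norm(R, "fro") ** 2   # sum of squared entries of R
    spectral = np.linalg.norm(R, 2)           # largest singular value of R
    return frob_sq, spectral
```

The Frobenius error adds up the error over all directions, while the spectral error reports only the worst single direction.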


Secondary Costs: Computational Resources


§  Run time: number of operations required per observed column

§  Memory


Previous Works


§  Regret minimization setting [WK 07], [NKW 13]

§  At time t, before observing $x_t$, predict $U_t$, a projection matrix onto a k-dimensional subspace. The loss is $\|x_t - U_t x_t\|^2$

§  Each $U_t$ can be completely different

§  Stochastic setting [ACS 13], [MCJ 13], [BDF 13]
   ›  $x_t$ are drawn i.i.d. from some distribution. Objective: find U as quickly as possible, minimizing $\mathbb{E}[\|x_t - U x_t\|^2]$

§  Reconstruction matrix (not an embedding) [CW 09]
   ›  $\min_{\Phi} \sum_t \|x_t - \Phi y_t\|^2$ s.t. $\Phi$ is an arbitrary linear transformation from $\mathbb{R}^k$ to $\mathbb{R}^d$


Results


§  X = $d \times n$ matrix whose columns are observed

§  $k \ll d$

§  $X_k$ = best rank-k approximation of X (top k directions)

§  OPT = $\|X - X_k\|_F^2$

§  Theorem 1: Given $\|X\|_F$, k, $\varepsilon$: Error = OPT + $\varepsilon \|X\|_F^2$
   ›  Memory, target dimension, processing time per column = $O(k/\varepsilon^2)$

§  Theorem 2: Given k, $\varepsilon$: Error = OPT + $\varepsilon \|X\|_F^2$
   ›  Memory, target dimension, processing time per column = $O(k/\varepsilon^3)$
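For reference, OPT can be computed offline from the singular values of X: by the Eckart-Young theorem, $\|X - X_k\|_F^2$ is the sum of the squared singular values beyond the k-th. The sketch below assumes the whole matrix fits in memory, which is exactly what the online setting avoids; the names `opt_rank_k` and `error_target` are hypothetical.

```python
import numpy as np

def opt_rank_k(X, k):
    """OPT = ||X - X_k||_F^2, the squared error of the best rank-k approximation."""
    s = np.linalg.svd(np.asarray(X, dtype=float), compute_uv=False)
    return float(np.sum(s[k:] ** 2))

def error_target(X, k, eps):
    """The additive bound OPT + eps * ||X||_F^2 from Theorems 1 and 2."""
    return opt_rank_k(X, k) + eps * np.linalg.norm(X, "fro") ** 2
```

Comparing an online algorithm's measured Frobenius error against `error_target` gives a simple sanity check on small inputs.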


The “Operator Norm” Cost Function


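§  Y = output matrix $[y_1, \ldots, y_n]$

§  Cost = $\|X - \Phi Y\|_F^2$
   ›  Interpretation: mean square error

§  Alternative cost: $\|X - \Phi Y\|_2$
   ›  Interpretation: bounds, over every unit vector v, $\|v^\top X - v^\top \Phi Y\|$

§  Noise ($X - X_k$) versus signal ($X_k$):
   ›  $\|X - X_k\|_F^2 \ll \|X_k\|_F^2$
   ›  $\|X - X_k\|_F^2 \gg \|X_k\|_F^2$ but… $\|X - X_k\|_2 \ll \|X_k\|_2$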


Results


§  Theorem 3 [under construction]: Given $\|X\|_2$, $\|X - X_k\|_2$, k, $\varepsilon$: Operator Norm Error = $\mathrm{OPT}_{\text{operator}} + \varepsilon \|X\|_2^2$
   ›  Target dimension = $O(k/\varepsilon)$


Algorithm


§  Maintain $U: \mathbb{R}^d \to \mathbb{R}^\ell$

§  Directions are only added, never removed (for now)

•  r = tolerable error radius = $\|X\|_F / \ell^{1/2}$

•  “Error ellipsoid”

•  Add vector $u_1$ to U
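The steps above can be sketched roughly as follows. This is a simplified toy variant written from the slide's description, not the authors' exact algorithm: it assumes $\|X\|_F$ is known in advance, stores the full $d \times d$ residual covariance, and adds the longest axis of the error ellipsoid to U whenever that axis exceeds the radius r. Here U is stored as a $d \times m$ matrix with orthonormal columns (so the output is $y_t = U^\top x_t$), and the name `online_pca_sketch` is hypothetical.

```python
import numpy as np

def online_pca_sketch(X, ell):
    """Toy 'add a direction when the error ellipsoid outgrows r' procedure.

    X   : d x n array, processed one column at a time.
    ell : sets the squared radius r^2 = ||X||_F^2 / ell.
    Returns (U, Y): U is d x m with orthonormal columns, Y is m x n.
    """
    X = np.asarray(X, dtype=float)
    d, n = X.shape
    radius_sq = np.linalg.norm(X, "fro") ** 2 / ell   # assumes ||X||_F is known
    U = np.zeros((d, 0))      # directions found so far (only ever appended)
    C = np.zeros((d, d))      # residual covariance, with all directions in U projected out
    outputs = []
    for t in range(n):
        x = X[:, t]
        r = x - U @ (U.T @ x)                          # part of x_t not captured by U
        evals, evecs = np.linalg.eigh(C + np.outer(r, r))
        while evals[-1] > radius_sq:
            u = evecs[:, -1]                           # longest axis of the ellipsoid
            U = np.column_stack([U, u])
            P = np.eye(d) - np.outer(u, u)             # remove u from the residuals
            C, r = P @ C @ P, P @ r
            evals, evecs = np.linalg.eigh(C + np.outer(r, r))
        C = C + np.outer(r, r)
        outputs.append(U.T @ x)                        # y_t in the current coordinates
    m = U.shape[1]
    Y = np.zeros((m, n))
    for t, y in enumerate(outputs):
        Y[: len(y), t] = y                             # zero-pad earlier, shorter outputs
    return U, Y
```

Because directions are only appended, earlier outputs stay valid after zero-padding, which is what lets a single final U serve as the embedding for every column.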


Analysis: Target Dimension


•  r = tolerable error radius = $\|X\|_F / \ell^{1/2}$

Target dimension = number of vectors added to U

Obs: adding a vector to U requires $\|X\|_F^2 / \ell$ weight from $\|X\|_F^2$

$\Rightarrow$ number of vectors added to U $\le \ell$
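One way to spell out this counting step, assuming the m added directions $u_1, \ldots, u_m$ are orthonormal and each accounts for at least $\|X\|_F^2 / \ell$ of the squared mass of X (the "weight" above). This is a sketch of the argument, not the authors' proof:

```latex
\frac{m \,\|X\|_F^2}{\ell}
  \;\le\; \sum_{j=1}^{m} \|u_j^\top X\|^2
  \;\le\; \|X\|_F^2
  \quad\Longrightarrow\quad m \le \ell .
```

The second inequality holds because orthonormal directions cannot capture more than the total squared mass of X.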


Analysis: Cost


•  “Error ellipsoid”

•  Y = output matrix

•  R = error matrix = $X - U_n^\top Y$

Operator norm cost = $\|R\|_2^2 = \max\{r_1^2, r_2^2\}$

Cost = $\|R\|_F^2 = r_1^2 + r_2^2$

[Figure: error ellipsoid with axes $r_1$ and $r_2$]


Analysis: Cost


•  r = tolerable error radius = $\|X\|_F / \ell^{1/2}$

•  “Error ellipsoid”

•  Y = output matrix

•  R = error matrix = $X - U_n^\top Y$

Statements:

•  $\|R\|_2^2 \le r^2 = \|X\|_F^2 / \ell$

•  $\|R\|_F^2 \le$ loss from $X_k$ + loss from $X - X_k$ $\le \|X\|_F^2 (k/\ell)^{1/2} + \|X - X_k\|_F^2$


Implementation: Memory and Run-time Complexity


$r_t = x_t - U_t x_t$

$R = [r_1, r_2, \ldots, r_t]$

§  Straightforward version requires maintaining $RR^\top$
   ›  Update time, memory requirements = $d^2$

§  Instead: maintain Z, a $d \times \ell$ matrix such that $ZZ^\top \approx RR^\top$

§  $\|ZZ^\top - RR^\top\| < \|R\|_F^2 / \ell$

§  [Lib 12] Update time, memory requirements = $d\ell$
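A minimal sketch of a Frequent Directions style update in the spirit of [Lib 12], kept deliberately simple: it shrinks after every column, so it uses $d\ell$ memory but performs one SVD per column rather than the amortized update quoted on the slide. It assumes $1 \le \ell \le d$; the function name `frequent_directions` is hypothetical.

```python
import numpy as np

def frequent_directions(R, ell):
    """Return a d x ell matrix Z with Z Z^T close to R R^T, reading R column by column."""
    R = np.asarray(R, dtype=float)
    d = R.shape[0]
    if not 1 <= ell <= d:
        raise ValueError("this sketch assumes 1 <= ell <= d")
    Z = np.zeros((d, ell))
    for r in R.T:
        Z[:, -1] = r                        # the last column is zero at this point
        U, s, _ = np.linalg.svd(Z, full_matrices=False)
        delta = s[-1] ** 2                  # smallest squared singular value
        s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        Z = U * s_shrunk                    # scales column j of U by s_shrunk[j]
    return Z
```

Each shrink step removes the same amount from all $\ell$ squared singular values, so the total amount removed, which bounds the spectral error, is at most $\|R\|_F^2 / \ell$.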


Implementation: Unknown Horizon


Error radius parameter = $\|X\|_F / \ell^{1/2}$

§  Def: $X_t = [x_1, \ldots, x_t]$

§  Idea: use growing radius parameter $\|X_t\|_F / \ell^{1/2}$

§  Thm: works as before, but target dimension = $\ell \cdot \log(n)$
   ›  Divide time into epochs; in each epoch, $N \le \|X_t\|_F \le 2N$
   ›  At most $\ell$ directions are added in each epoch

§  Idea 2: if direction u becomes weak ($\|u^\top X_t\| \ll \|X_t\|_F / \ell^{1/2}$), remove it

§  Thm: works as before, target dimension = $\ell / \varepsilon$
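The epoch division can be illustrated with a short sketch that assigns each time step to the epoch in which the running norm $\|X_t\|_F$ lies between N and 2N. Taking the first epoch's N to be $\|x_1\|$ and doubling from there is an assumption made for illustration, and the name `epoch_indices` is hypothetical.

```python
import numpy as np

def epoch_indices(X):
    """Epoch j holds the times t with N * 2**j <= ||X_t||_F < N * 2**(j + 1), N = ||x_1||."""
    X = np.asarray(X, dtype=float)
    running_sq = np.cumsum(np.sum(X ** 2, axis=0))   # ||X_t||_F^2 for t = 1..n
    if running_sq[0] <= 0:
        raise ValueError("this sketch assumes a nonzero first column")
    # ||X_t||_F / N >= 2**j  is the same as  ||X_t||_F^2 / N^2 >= 4**j
    return np.floor(0.5 * np.log2(running_sq / running_sq[0])).astype(int)
```

With at most $\ell$ directions added per epoch, the number of distinct epochs times $\ell$ bounds the target dimension, which is where the extra logarithmic factor comes from.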


Conclusions and Open Questions


§  We obtain error = OPT + $\varepsilon \|X\|_F^2$ with target dimension $O(k/\varepsilon^3)$. Can we reduce the dependence on $\varepsilon$?

§  Improve to OPT$(1+\varepsilon)$?

§  Lower bound? (currently same for arbitrary reconstruction matrix)

§  Obtain approximation of OPT + $\varepsilon \|X - X_k\|_2^2$


Thank you!
