EE 378B: Inference, Estimation, and Information Processing Lecture 4 Andrea Montanari Stanford University April 8, 2015 Andrea Montanari (Stanford University) EE 378B: Lecture 4 April 8, 2015 1 / 53


  • EE 378B: Inference, Estimation,

    and Information Processing

    Lecture 4

    Andrea Montanari

    Stanford University

    April 8, 2015

    Andrea Montanari (Stanford University) EE 378B: Lecture 4 April 8, 2015 1 / 53

  • Outline

    1 A reminder

    2 No, seriously, why does this work?

    3 Examples


  • ALERT: A LOT OF (COOL) LINEAR ALGEBRA!!!!


  • ALERT: PERTURBATION THEORY FOR LINEAR OPERATORS!!!!


  • A reminder


  • Laplacian

    Similarity matrix

    $A_{ij}$ = similarity between $i$ and $j$ ($= A_{ji}$)

    Degree matrix

    $D = \mathrm{diag}(d)$, $d_i = \sum_{j \in [n]} A_{ij}$

    (Normalized) Laplacian matrix

    $L_n = I - D^{-1/2} A D^{-1/2}$

  • The algorithm

    Spectral Clustering

    Input: Similarity matrix $A \in \mathbb{R}^{n \times n}$
    Output: $k$ clusters $S_1, \dots, S_k$
    1: Compute the Laplacian $L_n = I - D^{-1/2} A D^{-1/2}$;
    2: Compute the first $k$ eigenvectors of $L_n$, $u_1, \dots, u_k$;
    3: For each $i \in \{1, \dots, n\}$
    4:     Let $y_i = (u_{1,i}, \dots, u_{k,i}) \in \mathbb{R}^k$
    5:     Set $x_i = y_i / \|y_i\|$
    6: Cluster $\{x_1, \dots, x_n\}$ using K-Means
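The spectral clustering steps can be sketched in NumPy as follows. This is a minimal illustration, not the lecture's code: the helper names (`normalized_laplacian`, `kmeans`, `spectral_clustering`), the farthest-point K-Means initialization, and the toy similarity matrix are all assumptions made for the example, and every degree is assumed to be positive.

```python
import numpy as np

def normalized_laplacian(A):
    """L_n = I - D^{-1/2} A D^{-1/2} for a symmetric similarity matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)  # assumes every degree d_i is positive
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations with farthest-point initialization (illustrative)."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dist, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def spectral_clustering(A, k):
    L = normalized_laplacian(A)
    _, eigvecs = np.linalg.eigh(L)      # eigenvalues come back in ascending order
    Y = eigvecs[:, :k]                  # row i is y_i = (u_{1,i}, ..., u_{k,i})
    X = Y / np.linalg.norm(Y, axis=1, keepdims=True)  # x_i = y_i / ||y_i||
    return kmeans(X, k)

# Toy similarity matrix: two groups of 3, strong within groups, weak across.
A = np.full((6, 6), 0.05)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
labels = spectral_clustering(A, 2)
```

On this toy input the first three items and the last three items should end up in different clusters; the cluster numbering itself is arbitrary.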

  • Consequences

    Proposition

    (1) $L_n \succeq 0$
    (2) If $G$ connected, then $0 = \lambda_1(L_n) < \lambda_2(L_n) \le \cdots \le \lambda_n(L_n)$ and...
    (3) ... $u_{1,i} = C \sqrt{d_i}$.

  • k disconnected components

    $[n] = S_1 \cup S_2 \cup \cdots \cup S_k$, $A_{S_i, S_j} = 0$ for $i \neq j$

    $$L_n = \begin{pmatrix}
    L_n^{(1)} & 0 & 0 & 0 \\
    0 & L_n^{(2)} & 0 & 0 \\
    0 & 0 & \cdots & 0 \\
    0 & 0 & 0 & L_n^{(k)}
    \end{pmatrix}$$

  • Subtlety. . .

    Compute first $k$ eigenvectors.

    $\tilde{Y} = [u_1 | \cdots | u_k] \in \mathbb{R}^{n \times k}$

    $\tilde{Y} = Y R$, $R \in \mathbb{R}^{k \times k}$, $R^* R = I$

    $x_i = y_i / \|y_i\| = (1, 0, 0, 0, 0) R$ for $i \in S_1$
    $x_i = y_i / \|y_i\| = (0, 1, 0, 0, 0) R$ for $i \in S_2$
    $x_i = y_i / \|y_i\| = (0, 0, 1, 0, 0) R$ for $i \in S_3$
    ...

  • Not too bad!


  • No, seriously, why does this work?


  • Idea

    G ‘close to’ k disconnected components


  • Let us try to formalize

    $L_n \in \mathbb{R}^{n \times n}$ normalized Laplacian
    $E_0 = [u_1, \dots, u_k] \in \mathbb{R}^{n \times k}$, $E_0^* E_0 = I_{k \times k}$, first $k$ eigenvectors

    Meta theorem

    If $L_n'$ is close to $L_n$ then $E_0'$ is close to $E_0$.

  • ‘Close to’

    $M \in \mathbb{R}^{m \times n}$

    $$\|M\|_2 = \sigma_{\max}(M) = \max\{\|Mu\| : \|u\| = 1\} = \max\{\langle v, Mu \rangle : \|u\| = 1, \|v\| = 1\}$$

    (orthogonally invariant: $\|M\| = \|RMS\|$ for $RR^* = SS^* = I$)

  • Let us try to formalize, again

    Two matrices $A$, $A + H \in \mathbb{R}^{n \times n}$ symmetric.
    $E_0, F_0 \in \mathbb{R}^{n \times k}$ orthogonal

    $E_0, F_0$ bases of eigenspaces:

    $A E_0 = E_0 A_0$, $(A + H) F_0 = F_0 B_0$

    Meta Theorem

    $d(E_0, F_0) \le \text{some function of}(\|H\|_2)$

  • Let us try to formalize, again

    $d(E_0, F_0)$ must depend only on the spaces spanned by the columns of $E_0$, $F_0$

    $d_p(E_0, F_0) \equiv \|E_0 E_0^* - F_0 F_0^*\|_2$

    Lemma (1)

    If $F = [F_0 | F_1] \in \mathbb{R}^{n \times n}$ with $F^* F = I$, then
    $d_p(E_0, F_0) = \|F_1^* E_0\|_2 = \|F_0^* E_1\|_2$.

  • Another way to look at it: Principal angles

    $E_0^* F_0 = U \cos\Theta \, V^*$, $U, V \in O(k, k)$

    $O(m, n) = \{Q \in \mathbb{R}^{m \times n} : Q^* Q = I_{n \times n}\}$

    $\Theta = \mathrm{diag}(\theta_1, \dots, \theta_k)$

    (for instance, if $k = 1$, $e_0^* f_0 = \cos\theta$)

  • Another way to look at it: Principal angles

    $E_0^* F_0 = U \cos\Theta \, V^*$, $U, V \in O(k)$

    Lemma (2)

    $\|E_0^* F_1\|_2 = \|\sin\Theta\|_2 = \max\{|\sin\theta_1|, \dots, |\sin\theta_k|\}$

    Lemma (1) + Lemma (2) $\Rightarrow$ $d_p(E_0, F_0) = \|\sin\Theta\|_2$
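Lemma (1) and Lemma (2) can be sanity-checked numerically on random subspaces. The sketch below is only an illustration under assumed values ($n = 8$, $k = 3$, a fixed random seed); it reads the principal angles off the singular values of $E_0^* F_0$ and compares the four quantities that the lemmas claim are equal.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3  # illustrative sizes, not from the lecture

def random_orthogonal(n, rng):
    """Orthogonal matrix from the QR factorization of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q

def op_norm(M):
    """Operator norm ||M||_2 (largest singular value)."""
    return np.linalg.norm(M, 2)

E = random_orthogonal(n, rng)
F = random_orthogonal(n, rng)
E0, E1 = E[:, :k], E[:, k:]
F0, F1 = F[:, :k], F[:, k:]

# Projector distance d_p(E0, F0) = ||E0 E0^* - F0 F0^*||_2
d_p = op_norm(E0 @ E0.T - F0 @ F0.T)

# Singular values of E0^* F0 are cos(theta_1), ..., cos(theta_k)
cosines = np.clip(np.linalg.svd(E0.T @ F0, compute_uv=False), 0.0, 1.0)
sin_theta_max = np.sqrt(1.0 - cosines.min() ** 2)

lemma1_a = op_norm(F1.T @ E0)  # ||F1^* E0||_2
lemma1_b = op_norm(F0.T @ E1)  # ||F0^* E1||_2
lemma2 = op_norm(E0.T @ F1)    # ||E0^* F1||_2
```

All of `d_p`, `lemma1_a`, `lemma1_b`, `lemma2` and `sin_theta_max` should agree up to floating-point error.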

  • Proof of Lemma (2)


  • Proof of Lemma (1)


  • Yet another distance

    $$d_c(E_0, F_0) = \min_{Q, R \in O(k)} \|E_0 Q - F_0 R\|_2 = \min_{R \in O(k)} \|E_0 - F_0 R\|_2$$

    Lemma

    $d_c(E_0, F_0) = \|2 \sin(\Theta/2)\|_2$.

    Corollary

    $d_p(E_0, F_0) \le d_c(E_0, F_0) \le \sqrt{2} \, d_p(E_0, F_0)$.
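A quick numerical illustration of the Lemma and the Corollary, under assumed values ($n = 8$, $k = 3$, fixed seed). The rotation $R = V U^*$ built from the SVD $E_0^* F_0 = U \cos\Theta \, V^*$ is the usual Procrustes choice; the sketch only checks that this particular $R$ attains $\|2\sin(\Theta/2)\|_2$, it does not search over all of $O(k)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3  # illustrative sizes, not from the lecture

# Two random n x k matrices with orthonormal columns
E0, _ = np.linalg.qr(rng.standard_normal((n, k)))
F0, _ = np.linalg.qr(rng.standard_normal((n, k)))

# SVD: E0^* F0 = U cos(Theta) V^*
U, cosines, Vt = np.linalg.svd(E0.T @ F0)
theta = np.arccos(np.clip(cosines, -1.0, 1.0))

# Candidate aligning rotation R = V U^*
R = Vt.T @ U.T
aligned_gap = np.linalg.norm(E0 - F0 @ R, 2)

d_p = np.sin(theta).max()            # ||sin(Theta)||_2
d_c = (2 * np.sin(theta / 2)).max()  # ||2 sin(Theta/2)||_2
```

Here `aligned_gap` should match `d_c`, and `d_p <= d_c <= sqrt(2) * d_p` should hold since the principal angles lie in $[0, \pi/2]$.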

  • Proof of the Lemma


  • Finally, a theorem! The setting

    Two matrices $A$, $A + H \in \mathbb{R}^{n \times n}$ symmetric.

    $E_0, F_0 \in \mathbb{R}^{n \times k}$, $E_1, F_1 \in \mathbb{R}^{n \times (n-k)}$

    $E = [E_0 | E_1]$, $F = [F_0 | F_1] \in \mathbb{R}^{n \times n}$ orthogonal

    $E_0, F_0$ bases of eigenspaces:

    $$A = E \begin{pmatrix} A_0 & 0 \\ 0 & A_1 \end{pmatrix} E^* , \qquad A + H = F \begin{pmatrix} B_0 & 0 \\ 0 & B_1 \end{pmatrix} F^* .$$

  • Finally, a theorem!

    Theorem (Davis, Kahan sin theta theorem)

    Assume $\mathrm{eval}(A_0) \subseteq [a, b]$, $\mathrm{eval}(B_1) \subseteq (-\infty, a - \delta) \cup (b + \delta, \infty)$. Then

    $$d_p(E_0, F_0) \le \frac{1}{\delta} \|H\|_2 .$$


  • Let us apply it to our case

    $L_n$, eigenvalues $\lambda_1, \lambda_2, \dots$
    $L_n'$, eigenvalues $\lambda_1', \lambda_2', \dots$

    Theorem

    Letting $Y = [u_1 | \dots | u_k]$, $Y' = [u_1' | \dots | u_k']$, we have

    $$d_c(Y, Y') \le \frac{\sqrt{2}}{\lambda_{k+1}'} \|L_n' - L_n\|_2 .$$

  • More explicitly

    $L_n$, eigenvalues $\lambda_1, \lambda_2, \dots$
    $L_n'$, eigenvalues $\lambda_1', \lambda_2', \dots$

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y = [u_1 | \dots | u_k]$, $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\|Y - Y'\|_2 \le \frac{\sqrt{2}}{\lambda_{k+1}'} \|L_n' - L_n\|_2 .$$

  • Even more explicitly

    $L_n$, eigenvalues $\lambda_1, \lambda_2, \dots$
    $L_n'$, eigenvalues $\lambda_1', \lambda_2', \dots$

    $y_i$ rows of $Y$

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - y_i'\|^2 = \frac{1}{n} \|Y - Y'\|_F^2 \le \frac{k}{n} \|Y - Y'\|_2^2$$

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y = [u_1 | \dots | u_k]$, $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - y_i'\|^2 \le \frac{2k}{n (\lambda_{k+1}')^2} \|L_n' - L_n\|_2^2 .$$

    ... $\lambda_{k+1}'$ is tricky ...

  • A usable theorem

    $L_n$, eigenvalues $\lambda_1, \lambda_2, \dots$

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y = [u_1 | \dots | u_k]$, $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - y_i'\|^2 \le \frac{2k}{n (\lambda_{k+1} - \|L_n' - L_n\|_2)^2} \|L_n' - L_n\|_2^2 .$$


  • EXAMPLES!!!!


  • First example

    $$A = \begin{pmatrix} 0 & 0 \\ 0 & a \end{pmatrix} , \qquad H = \begin{pmatrix} 0 & h \\ h & 0 \end{pmatrix}$$

    $n = 2$, $k = 1$, $\|H\|_2 = h$

  • Let’s apply the sin theta theorem

    $E_0 = [e_0]$ = eigenspace with smallest eigenvalue of $A$
    $F_0 = [f_0]$ = eigenspace with smallest eigenvalue of $B = A + H$

    $\delta = ???$

    Need to compute the eigenvalues of $A + H$

  • Eigenvalues

    $$A + H = \begin{pmatrix} 0 & h \\ h & a \end{pmatrix} , \qquad \lambda_{1,2} = \frac{a}{2} \Big\{ 1 \pm \sqrt{1 + \frac{4h^2}{a^2}} \Big\} .$$

    Can take $\delta = a$ (a bit more if $h > 0$...)

  • Sin Theta

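    $$\sin\theta = \sqrt{1 - \langle e_0, f_0 \rangle^2} \le \frac{\|H\|_2}{\delta} \le \frac{h}{a} .$$

    Explicit calculation ($\tilde{h} = h/a$, $\Delta = \sqrt{1 + 4\tilde{h}^2}$)

    $e_0 = (1, 0)$

    $$f_0 = \frac{1}{(2\Delta + 2\Delta^2)^{1/2}} \big( 1 + \Delta, \, -2\tilde{h} \big) ,$$

    $$\langle f_0, e_0 \rangle = \frac{1 + \Delta}{(2\Delta + 2\Delta^2)^{1/2}} = 1 - \frac{h^2}{2a^2} + O(h^4)$$

    Not bad!!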
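The $2 \times 2$ example is small enough to check directly. The sketch below uses assumed values $a = 1$, $h = 0.1$ (any $a > 0$, $h > 0$ would do) and compares the numerically computed eigenvector with the closed form for $\langle f_0, e_0 \rangle$ and with the sin theta bound for $\delta = a$.

```python
import numpy as np

a, h = 1.0, 0.1  # illustrative values, not from the lecture
A = np.array([[0.0, 0.0], [0.0, a]])
H = np.array([[0.0, h], [h, 0.0]])

e0 = np.array([1.0, 0.0])             # eigenvector of A with smallest eigenvalue
eigvals, eigvecs = np.linalg.eigh(A + H)
f0 = eigvecs[:, 0]                    # eigenvector of A + H with smallest eigenvalue
f0 = f0 * np.sign(f0 @ e0)            # eigenvectors are defined up to sign

sin_theta = np.sqrt(1.0 - (e0 @ f0) ** 2)
dk_bound = np.linalg.norm(H, 2) / a   # ||H||_2 / delta with delta = a

# Closed form from the slide
h_t = h / a
Delta = np.sqrt(1 + 4 * h_t**2)
closed_form = (1 + Delta) / np.sqrt(2 * Delta + 2 * Delta**2)
```

The inner product `e0 @ f0` should coincide with `closed_form`, and `sin_theta` should stay below `dk_bound`.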

  • Using our usable theorem. . .

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y = [u_1 | \dots | u_k]$, $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - y_i'\|^2 \le \frac{2k}{n (\lambda_{k+1} - \|H\|_2)^2} \|H\|_2^2 .$$

    Applying it to our case: $k = 1$, $\lambda_2 = a$, $\|H\|_2 = h$

    $$\frac{1}{2} \|e_0 - f_0\|^2 \le \frac{h^2}{(a - h)^2} , \qquad \langle e_0, f_0 \rangle \ge 1 - \frac{h^2}{(a - h)^2} .$$

  • Slightly more interesting example

    Similarity matrix

    $$A = \begin{pmatrix}
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p
    \end{pmatrix}$$

  • Slightly more interesting example

    Laplacian

    $$I - L_n' = \frac{1}{qn + (p - q)n/k} \begin{pmatrix}
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p
    \end{pmatrix}$$

  • Slightly more interesting example

    Laplacian

    $$\tilde{L}_n' = \begin{pmatrix}
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p
    \end{pmatrix}$$

  • Slightly more interesting example

    ‘Unperturbed’ Laplacian

    $$\tilde{L}_n = \begin{pmatrix}
    p-q & p-q & p-q & 0 & 0 & 0 & 0 & 0 & 0 \\
    p-q & p-q & p-q & 0 & 0 & 0 & 0 & 0 & 0 \\
    p-q & p-q & p-q & 0 & 0 & 0 & 0 & 0 & 0 \\
    0 & 0 & 0 & p-q & p-q & p-q & 0 & 0 & 0 \\
    0 & 0 & 0 & p-q & p-q & p-q & 0 & 0 & 0 \\
    0 & 0 & 0 & p-q & p-q & p-q & 0 & 0 & 0 \\
    0 & 0 & 0 & 0 & 0 & 0 & p-q & p-q & p-q \\
    0 & 0 & 0 & 0 & 0 & 0 & p-q & p-q & p-q \\
    0 & 0 & 0 & 0 & 0 & 0 & p-q & p-q & p-q
    \end{pmatrix}$$

    Top eigenvectors

    $u_1 = (1, 1, 1, 0, 0, 0, 0, 0, 0)$
    $u_2 = (0, 0, 0, 1, 1, 1, 0, 0, 0)$
    $u_3 = (0, 0, 0, 0, 0, 0, 1, 1, 1)$

  • The perturbation

    $$\tilde{L}_n' - \tilde{L}_n = \begin{pmatrix}
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q \\
    q & q & q & q & q & q & q & q & q
    \end{pmatrix}$$

    $\|\tilde{L}_n - \tilde{L}_n'\|_2 = qn$

  • Gap of the unperturbed Laplacian

    $$\tilde{L}_n = \begin{pmatrix}
    p-q & p-q & p-q & 0 & 0 & 0 & 0 & 0 & 0 \\
    p-q & p-q & p-q & 0 & 0 & 0 & 0 & 0 & 0 \\
    p-q & p-q & p-q & 0 & 0 & 0 & 0 & 0 & 0 \\
    0 & 0 & 0 & p-q & p-q & p-q & 0 & 0 & 0 \\
    0 & 0 & 0 & p-q & p-q & p-q & 0 & 0 & 0 \\
    0 & 0 & 0 & p-q & p-q & p-q & 0 & 0 & 0 \\
    0 & 0 & 0 & 0 & 0 & 0 & p-q & p-q & p-q \\
    0 & 0 & 0 & 0 & 0 & 0 & p-q & p-q & p-q \\
    0 & 0 & 0 & 0 & 0 & 0 & p-q & p-q & p-q
    \end{pmatrix}$$

    Eigenvalues (recall that we rescaled and added a multiple of the identity)

    $\lambda_1(\tilde{L}_n) = \cdots = \lambda_k(\tilde{L}_n) = (p - q) \frac{n}{k}$,

    $\lambda_{k+1}(\tilde{L}_n) = \cdots = \lambda_n(\tilde{L}_n) = 0$,

    $\lambda_1(\tilde{L}_n) - \lambda_{k+1}(\tilde{L}_n) = (p - q) n / k$

  • Using our usable theorem

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\frac{1}{n} \sum_{i=1}^n \|y_i' - e_{\mathrm{Cluster}(i)}\|^2 \le \frac{2k}{n (\lambda_1 - \lambda_{k+1} - \|L_n' - L_n\|_2)^2} \|L_n' - L_n\|_2^2 .$$

    with $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ...

  • Using our usable theorem

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\frac{1}{n} \sum_{i=1}^n \|y_i' - e_{\mathrm{Cluster}(i)}\|^2 \le \frac{2k}{n ((p - q)(n/k) - qn)^2} \, q^2 n^2 \le \frac{2k^3 q^2}{n (p - q - qk)^2}$$

    with $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ...

    Works for $q < p/(k + 1) - \varepsilon$

  • Better estimate of $\delta$

    $\lambda_1(\tilde{L}_n) = \cdots = \lambda_k(\tilde{L}_n) = (p - q) \frac{n}{k}$,

    $\lambda_{k+1}(\tilde{L}_n) = \cdots = \lambda_n(\tilde{L}_n) = 0$,

    $\lambda_1(\tilde{L}_n') = qn + (p - q) \frac{n}{k}$,

    $\lambda_2(\tilde{L}_n') = \cdots = \lambda_k(\tilde{L}_n') = (p - q) \frac{n}{k}$,

    $\lambda_{k+1}(\tilde{L}_n') = \cdots = \lambda_n(\tilde{L}_n') = 0$,

    Can take $\delta = (p - q) n / k$
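The eigenvalues listed for $\tilde{L}_n$ and $\tilde{L}_n'$ can be confirmed numerically on the $9 \times 9$ example. The values $p = 0.6$, $q = 0.4$ below are an assumption borrowed from the later plots; the variable names are illustrative.

```python
import numpy as np

n, k, p, q = 9, 3, 0.6, 0.4  # n and k match the 9 x 9 example; p, q are assumed
labels = np.repeat(np.arange(k), n // k)        # cluster of each item
same = labels[:, None] == labels[None, :]       # True inside diagonal blocks

L_pert = np.where(same, p, q)                   # perturbed matrix (p / q blocks)
L_unpert = np.where(same, p - q, 0.0)           # unperturbed block-diagonal matrix

# Eigenvalues sorted from largest to smallest
ev_unpert = np.sort(np.linalg.eigvalsh(L_unpert))[::-1]
ev_pert = np.sort(np.linalg.eigvalsh(L_pert))[::-1]

# Operator norm of the perturbation (the all-q matrix)
pert_norm = np.linalg.norm(L_pert - L_unpert, 2)

gap = (p - q) * n / k                           # candidate delta
```

The top $k$ eigenvalues of the unperturbed matrix equal `gap`, the perturbed matrix has one eigenvalue `q * n + gap` followed by $k - 1$ copies of `gap`, and `pert_norm` equals `q * n`.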

  • Better estimate of $\delta$

    Theorem

    There exists $Q \in O(k)$ such that, letting $Y' = [u_1' | \dots | u_k'] Q$, we have

    $$\frac{1}{n} \sum_{i=1}^n \|y_i' - e_{\mathrm{Cluster}(i)}\|^2 \le \frac{2k^3 q^2}{n (p - q)^2}$$

    with $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ...

    Works for

    $$q \le p \Big\{ 1 - C \frac{k^{3/2}}{n^{1/2}} \Big\}$$

  • Well, this was not that impressive. . . :-(


  • A reasonably interesting example

    $$A_{ij} = \begin{cases} 1 & \text{with probability } P_{ij} \\ 0 & \text{otherwise.} \end{cases}$$

  • Where

    $$P = \begin{pmatrix}
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p
    \end{pmatrix}$$

    Does this behave as before?

  • Much more impressive!

    [Figure: plot with both axes running from 0.0 to 1.0; n = 150, k = 3, p = 0.6, q = 0.4]

  • EE 378B: Inference, Estimation, and Information Processing

    Lecture 6

    Andrea Montanari

    Stanford University

    April 15, 2015

    Andrea Montanari (Stanford University) EE 378B: Lecture 6 April 15, 2015 1 / 32

  • Outline

    1 The planted model. . .

    2 . . . and its analysis


  • The planted model


  • Theoretical Computer Science: Planted partition model.

    Statistics: Stochastic block model.


  • Probability matrix

    $$P = \begin{pmatrix}
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p
    \end{pmatrix}$$

    $k$ clusters of size $n/k$, $0 \le q < p \le 1$.

  • Similarity

    $$A_{ij} = \begin{cases} 1 & \text{with probability } P_{ij} \\ 0 & \text{otherwise.} \end{cases}$$

    Think of this as the adjacency matrix of a graph $G$.
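Sampling from the planted model takes only a few lines. The sketch below is one possible way to do it, not the code behind the lecture's plots: the function name `sample_planted_partition` is hypothetical, $n$ is assumed divisible by $k$, and the entries with $i \le j$ are drawn independently and then mirrored so that $A$ is symmetric.

```python
import numpy as np

def sample_planted_partition(n, k, p, q, rng):
    """Draw a symmetric 0/1 matrix A with P(A_ij = 1) = P_ij.

    Assumes n is divisible by k; items are ordered cluster by cluster.
    """
    labels = np.repeat(np.arange(k), n // k)
    P = np.where(labels[:, None] == labels[None, :], p, q)
    upper = np.triu(rng.random((n, n)) < P)  # independent draws for i <= j
    A = upper | upper.T                      # mirror to the lower triangle
    return A.astype(float), P, labels

rng = np.random.default_rng(0)
A, P, labels = sample_planted_partition(150, 3, 0.6, 0.4, rng)
degrees = A.sum(axis=1)
```

With these parameters each row of `P` sums to 70, so the entries of `degrees` should scatter around that value.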

  • Here is what it looks like

    [Figure: plot with both axes running from 0.0 to 1.0; n = 150, k = 3, p = 0.6, q = 0.4]

  • Can you see the clusters?

    [Figure: plot with both axes running from 0.0 to 1.0; n = 150, k = 3, p = 0.6, q = 0.4]

  • Permuting randomly the items

    [Figure: plot with both axes running from 0.0 to 1.0; n = 150, k = 3, p = 0.6, q = 0.4]

  • Will we succeed?

    [Figure: eigenvalues eM$val plotted against Index (0 to 150); n = 150, k = 3, p = 0.6, q = 0.4]

  • Will we succeed?

    [Figure: scatter plot of eM$vec[, 2] (horizontal axis) against eM$vec[, 3] (vertical axis), each point drawn as its cluster label 1, 2 or 3; n = 150, k = 3, p = 0.6, q = 0.4]

  • Slightly different p and q

    [Figure: plot with both axes running from 0.0 to 1.0; n = 150, k = 3, p = 0.65, q = 0.35]

  • Will we succeed?

    [Figure: eigenvalues eM$val plotted against Index (0 to 150); n = 150, k = 3, p = 0.65, q = 0.35]

  • Will we succeed?

    [Figure: scatter plot of eM$vec[, 2] (horizontal axis) against eM$vec[, 3] (vertical axis), each point drawn as its cluster label 1, 2 or 3; n = 150, k = 3, p = 0.65, q = 0.35]

  • . . . and its analysis


  • The question

    $p = \frac{1}{2}(1 + \varepsilon)$, $q = \frac{1}{2}(1 - \varepsilon)$

    Tradeoff $n$ vs $\varepsilon$???

  • I cheated a little bit!!!

    Largest eigenvalues/eigenvectors of $A$ (instead of Laplacian)

    $I - L = D^{-1/2} A D^{-1/2}$

    $D = \mathrm{diag}(d_1, d_2, \dots, d_n)$, $d_i = \sum_{j \in [n]} A_{ij}$

  • Plot di vs i

    [Figure: degrees $d_i$ (axis label "sum") plotted against Index $i$ (0 to 150)]

  • Why?

    $$d_i = \sum_{j=1}^n A_{ij} \approx \sum_{j=1}^n P_{ij} = \frac{n}{k}(p - q) + nq$$

    Therefore $D \approx \mathrm{const} \cdot I$

  • Strategy

    Need to understand the top k eigenvectors of A

    Idea: $A = P + X$, $X = A - P$, $\mathbb{E}\{X\} = 0$

    $X$ is a small perturbation

  • Eigenvectors/eigenvalues of P

    $$P = \begin{pmatrix}
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    p & p & p & q & q & q & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & p & p & p & q & q & q \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p \\
    q & q & q & q & q & q & p & p & p
    \end{pmatrix}$$

  • Eigenvectors/eigenvalues of P

    $u_1 = C \, (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)$
    $u_2 = C' (k-1, k-1, k-1, -1, -1, -1, -1, -1, -1, -1, -1, -1)$
    $\cdots$
    $u_k = C' (-1, -1, -1, -1, -1, -1, -1, -1, -1, k-1, k-1, k-1)$

    $\lambda_1 = qn + (p - q) \frac{n}{k}$,

    $\lambda_2 = \cdots = \lambda_k = (p - q) \frac{n}{k}$,

    $\lambda_{k+1} = \cdots = \lambda_n = 0$

  • Applying perturbation theory

    $y_1, \dots, y_n \in \mathbb{R}^k$: embedding from spectral method (up to rotation)
    $e_1, \dots, e_k \in \mathbb{R}^k$: standard basis

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - e_{\mathrm{cluster}(i)}\|^2 \le \frac{2k}{n ((p - q)(n/k) - \|X\|_2)^2} \|X\|_2^2 .$$

  • Applying perturbation theory

    $y_1, \dots, y_n \in \mathbb{R}^k$: embedding from spectral method (up to rotation)
    $e_1, \dots, e_k \in \mathbb{R}^k$: standard basis

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - e_{\mathrm{cluster}(i)}\|^2 \le \frac{2k}{n (n\varepsilon/k - \|X\|_2)^2} \|X\|_2^2 .$$

  • The single most important fact in random matrix theory

    Claim

    $\|X\|_2 \lesssim \sqrt{n}$
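The claim can be probed empirically by sampling $A$, forming $X = A - P$, and watching $\|X\|_2 / \sqrt{n}$ as $n$ grows. This is a simulation sketch under assumed parameters ($k = 3$, $p = 0.6$, $q = 0.4$, three values of $n$, fixed seed), not a proof; the function name `noise_norm` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k, p, q = 3, 0.6, 0.4  # assumed parameters, matching the earlier plots

def noise_norm(n):
    """Sample A from the planted model and return ||A - P||_2."""
    labels = np.repeat(np.arange(k), n // k)
    P = np.where(labels[:, None] == labels[None, :], p, q)
    upper = np.triu(rng.random((n, n)) < P)  # independent draws for i <= j
    A = (upper | upper.T).astype(float)
    return np.linalg.norm(A - P, 2)

# Ratio ||X||_2 / sqrt(n) for a few sizes
ratios = {n: noise_norm(n) / np.sqrt(n) for n in (150, 300, 600)}
```

If the claim holds, the values in `ratios` should stay of order one rather than grow with $n$, in contrast with $\|P\|_2$, which grows linearly in $n$.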

  • Applying perturbation theory

    $y_1, \dots, y_n \in \mathbb{R}^k$: embedding from spectral method (up to rotation)
    $e_1, \dots, e_k \in \mathbb{R}^k$: standard basis

    $$\frac{1}{n} \sum_{i=1}^n \|y_i - e_{\mathrm{cluster}(i)}\|^2 \le \frac{2k}{n (n\varepsilon/k - \|X\|_2)^2} \|X\|_2^2 \le \frac{2k}{(n\varepsilon/k - \sqrt{n})^2} \le \frac{k}{5n}$$

    for $\varepsilon \ge \frac{10k}{\sqrt{n}}$

  • Proving the claim

    Claim

    $\|X\|_2 \lesssim \sqrt{n}$

    Independent entries ($i \le j$)

    $$X_{ij} = \begin{cases} 1 - P_{ij} & \text{with prob. } P_{ij} \\ -P_{ij} & \text{otherwise} \end{cases}$$

    $\mathbb{E}\{X_{ij}\} = 0$, $|X_{ij}| \le 1$

  • A first attempt

    $$\mathbb{E}\{\|X\|_2^2\} \le \mathbb{E}\{\|X\|_F^2\} = \sum_{i,j=1}^n \mathbb{E}\{X_{ij}^2\} \le n^2$$

    $$\mathbb{P}\{\|X\|_2 \ge 10n\} \le \frac{1}{10}$$

    This is A BAD BOUND!

  • A hammer

    Theorem (Matrix Bernstein inequality)

    $Z_1, \dots, Z_m \in \mathbb{R}^{n \times n}$ independent symmetric, $\mathbb{E} Z_i = 0$, $\|Z_i\|_2 \le R$, $\|\sum_i \mathbb{E}\{Z_i^2\}\| \le \sigma^2$. Let $X = Z_1 + Z_2 + \cdots + Z_m$. Then

    $$\mathbb{P}\{\|X\| \ge t\} \le n \exp\Big\{ -\frac{t^2}{6(Rt + \sigma^2)} \Big\}$$

  • In our case

    $Z_{ij} = X_{ij} (e_i e_j^* + e_j e_i^*)$

    $\|Z_{ij}\|_2 \le 1$

    $Z_{ij}^2 = X_{ij}^2 (e_i e_i^* + e_j e_j^*)$

    $$\sum_{ij} \mathbb{E}\{Z_{ij}^2\} = \mathrm{diag}(b) , \qquad b_i = \sum_j \mathbb{E}\{X_{ij}^2\} \le n$$

    $$\Big\| \sum_{ij} \mathbb{E}\{Z_{ij}^2\} \Big\|_2 \le 2n$$

  • Substituting

    $$\mathbb{P}\{\|X\| \ge t\} \le n \exp\Big\{ -\frac{t^2}{6(Rt + \sigma^2)} \Big\} \le n \exp\Big\{ -\frac{t^2}{6(t + n)} \Big\}$$

    Therefore, taking $t = 10\sqrt{n \log n}$ and $n$ large enough,

    $$\mathbb{P}\{\|X\| \ge 10\sqrt{n \log n}\} \le n \exp\Big\{ -\frac{100 \, n \log n}{6(n + 10\sqrt{n \log n})} \Big\} \le n \exp\Big\{ -\frac{100 \, n \log n}{20 n} \Big\} = n e^{-5 \log n} = n^{-4}$$

    This is an OK bound
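To see how the final step behaves for concrete $n$, the tail bound $n \exp\{-t^2 / (6(t + n))\}$ can be evaluated at $t = 10\sqrt{n \log n}$ and compared with $n^{-4}$. The sketch below takes $R = 1$ and $\sigma^2 = n$ as in the substitution above; the function names are illustrative.

```python
import math

def bernstein_tail(n, t, R=1.0, sigma2=None):
    """Right-hand side n * exp(-t^2 / (6 (R t + sigma^2))) of the inequality."""
    sigma2 = n if sigma2 is None else sigma2  # sigma^2 = n, as in the substitution
    return n * math.exp(-t**2 / (6 * (R * t + sigma2)))

def tail_at_threshold(n):
    """Evaluate the bound at t = 10 sqrt(n log n)."""
    t = 10 * math.sqrt(n * math.log(n))
    return bernstein_tail(n, t)

# Compare with n^{-4} for a small and a large n
small_n_ok = tail_at_threshold(10) <= 10**-4
large_n_ok = tail_at_threshold(1000) <= 1000**-4
```

The comparison with $n^{-4}$ relies on $6(n + t) \le 20n$, which only holds once $n$ is moderately large, so `small_n_ok` comes out false while `large_n_ok` comes out true.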