Techniques for Time-Space Tradeoff Lower Bounds for Branching Programs: Part II


Paul Beame, University of Washington

joint work with Erik Vee, Mike Saks, T.S. Jayram, Xiaodong Sun


The Trace of an Input

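Partition a subset of the layers $L_j$ into sets $\Lambda_1$, $\Lambda_2$

The trace of input x

• the sequence of nodes reached on input x as the computation moves from one set $\Lambda_i$ to the other

• E.g. trace(x) = $(v_1, v_2, v_3)$

• a = length of trace = # of alternations in the partition

• at most $2^{Sa}$ possible traces

[Figure: a branching program of length kn from source $v_0$ to sinks 1 and 0, cut into r layers ($L_1$, $L_2$, $L_5$ are labeled) of height kn/r; the path of x passes through the trace nodes $v_1$, $v_2$, $v_3$.]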


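Embedded $(m,\alpha)$-rectangles

An embedded $(m,\alpha)$-rectangle $R \subseteq D^n$ is a subset defined by

• disjoint sets $A, B \subseteq \{1,\ldots,n\}$ (the feet)

• a partial assignment $\sigma \in D^{\overline{A \cup B}}$ (the spine)

• sets of assignments $R_A \subseteq D^A$, $R_B \subseteq D^B$ (the legs)

$R = \{\, z \mid z_{\overline{A \cup B}} = \sigma,\ z_A \in R_A,\ z_B \in R_B \,\}$

• $|A|, |B| = m$

• $|R_A|/|D^A|,\ |R_B|/|D^B| \ge \alpha$ (density)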
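The definition above can be made concrete on a toy instance. The following is a minimal sketch, assuming 0-based coordinates and a small domain; the function names `embedded_rectangle` and `density` and all example values are hypothetical, chosen only for illustration.

```python
from itertools import product

def embedded_rectangle(n, A, B, spine, R_A, R_B):
    """Enumerate R = {z : z equals the spine outside A and B, z_A in R_A, z_B in R_B}.

    A, B  : disjoint tuples of coordinate indices (the feet)
    spine : dict index -> value for every coordinate outside A and B
    R_A, R_B : sets of tuples, assignments to A and to B (the legs)
    """
    assert not set(A) & set(B), "feet must be disjoint"
    assert set(spine) == set(range(n)) - set(A) - set(B), "spine must cover the rest"
    rect = set()
    for a, b in product(R_A, R_B):
        z = [None] * n
        for i, v in spine.items():
            z[i] = v
        for i, v in zip(A, a):
            z[i] = v
        for i, v in zip(B, b):
            z[i] = v
        rect.add(tuple(z))
    return rect

def density(D, foot, leg):
    """|R_A| / |D^A| for a leg over the given foot."""
    return len(leg) / len(D) ** len(foot)

# Toy example (hypothetical values): D = {0,1}, n = 5, m = 2.
D = (0, 1)
A, B = (0, 1), (3, 4)
R_A = {(0, 0), (1, 1)}
R_B = {(0, 1), (1, 0), (1, 1)}
R = embedded_rectangle(5, A, B, {2: 1}, R_A, R_B)
alpha = min(density(D, A, R_A), density(D, B, R_B))
```

Here `R` has $|R_A| \cdot |R_B|$ elements, and `alpha` is the smaller of the two leg densities, matching the density condition in the definition.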


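An embedded $(m,\alpha)$-rectangle

[Figure: an embedded $(m,\alpha)$-rectangle inside $D^n$. Among the variables $x_1, \ldots, x_n$, the feet $A$ and $B$ each have size $m$ (wlog drawn as two blocks, $A$ before $B$); the spine fixes the remaining variables; the legs $R_A \subseteq D^A$ and $R_B \subseteq D^B$ each have density at least $\alpha$.]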


Properties of a set of layers

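r layers (of height kn/r)

Let Layers(x,i) be the set of layers in which variable $x_i$ is read on input x

For a set $\Lambda$ of layers,

• $\mathrm{unread}(x,\Lambda) = \{\, i : \mathrm{Layers}(x,i) \cap \Lambda = \emptyset \,\}$

• $\mathrm{core}(x,\Lambda) = \{\, i : \emptyset \ne \mathrm{Layers}(x,i) \subseteq \Lambda \,\}$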
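A small sketch of these two definitions, assuming the layers read by each variable on a fixed input x are already known and stored in a dictionary. The names `layers_of`, `unread`, and `core`, and the example data, are hypothetical.

```python
def unread(layers_of, lam):
    """Indices i whose variable is read in no layer of lam."""
    return {i for i, ls in layers_of.items() if not (ls & lam)}

def core(layers_of, lam):
    """Indices i whose variable is read at least once, and only in layers of lam."""
    return {i for i, ls in layers_of.items() if ls and ls <= lam}

# Hypothetical access data for one input x: variable index -> set of layers reading it.
layers_of = {1: {1, 2}, 2: {2, 5}, 3: {3}, 4: {1, 4, 5}, 5: {6}}
all_layers = set(range(1, 7))
lam1, lam2 = {1, 2}, {3, 5}

core1 = core(layers_of, lam1)
core2 = core(layers_of, lam2)
unread_outside_lam1 = unread(layers_of, all_layers - lam1)
```

In this example every variable is read at least once, so the core on `lam1` coincides with the set of variables unread outside `lam1`.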


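Embedded rectangle partition $\mathcal{R}(\Lambda_1,\Lambda_2)$ of $f^{-1}(1)$ induced by $\Lambda_1$, $\Lambda_2$

Two inputs $x, y \in f^{-1}(1)$ are equivalent iff

• $\mathrm{trace}(x,\Lambda_1,\Lambda_2) = \mathrm{trace}(y,\Lambda_1,\Lambda_2)$

• $\mathrm{core}(x,\Lambda_1) = \mathrm{core}(y,\Lambda_1)$

• $\mathrm{core}(x,\Lambda_2) = \mathrm{core}(y,\Lambda_2)$

• $\mathrm{stem}(x,\Lambda_1,\Lambda_2) = \mathrm{stem}(y,\Lambda_1,\Lambda_2)$

where $\mathrm{stem}(x,\Lambda_1,\Lambda_2)$ is the partial assignment that has the values of x outside $\mathrm{core}(x,\Lambda_1)$ and $\mathrm{core}(x,\Lambda_2)$

Fixing the trace and the two cores induces the partition into pseudo-rectangles we used before

Fixing the stems fixes the common part of each pseudo-rectangle and produces the embedded rectangles we later reasoned about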


Previous argument

Throw out all embedded rectangles in $\mathcal{R}(\Lambda_1,\Lambda_2)$ for which $|\mathrm{core}(\cdot,\Lambda_1)|$ or $|\mathrm{core}(\cdot,\Lambda_2)|$ is smaller than m

Compute density bound on what's left

Problem with applying it to the Boolean case: the density bound is too small

Denominator contains 2(k 1)m

2n

2m

Better density bounds?


Boolean bounds

This talk will cover the density-bounding technique from [Ajtai 99a] with improvements from [B-Saks-Sun-Vee 00]

Yields density 2m which is large enough for the Boolean case

Yields $T = \Omega\!\left( n \sqrt{\dfrac{\log(n/S)}{\log\log(n/S)}} \right)$


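Generalized method for choosing $\Lambda_1$, $\Lambda_2$

Generalization of the method from [BRS 89], [BST 98]

Distribution $\mathcal{D}_q$ for probability $q \le 1/2$

• $\Pr[L_i \in \Lambda_1] = \Pr[L_i \in \Lambda_2] = q$

• $\Pr[L_i \notin \Lambda_1 \cup \Lambda_2] = 1 - 2q$

• Independent for each i

$E[|\mathrm{core}(x,\Lambda_1)|] = E[|\mathrm{core}(x,\Lambda_2)|] \ge n q^k$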
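The expectation bound can be checked numerically. This is a sketch under two stated assumptions: every variable is read in at least one layer, and the layer counts $|\mathrm{Layers}(x,i)|$ sum to at most kn (the path has length kn). Under those assumptions a variable read in t layers lands in the core with probability $q^t$, and convexity of $q^t$ in t gives the $n q^k$ bound. The names `expected_core_size` and `sample_partition` and the example numbers are hypothetical.

```python
import random

def expected_core_size(layer_counts, q):
    """Exact E[|core(x, Lambda_1)|]: a variable read in t >= 1 layers is in the
    core exactly when all t of its layers are placed in Lambda_1 (probability q**t)."""
    return sum(q ** t for t in layer_counts if t >= 1)

def sample_partition(r, q, rng):
    """Draw (Lambda_1, Lambda_2): each layer independently goes to Lambda_1 with
    probability q, to Lambda_2 with probability q, and to neither otherwise."""
    lam1, lam2 = set(), set()
    for layer in range(r):
        u = rng.random()
        if u < q:
            lam1.add(layer)
        elif u < 2 * q:
            lam2.add(layer)
    return lam1, lam2

# Hypothetical input: n = 4 variables read in 1, 2, 3, 2 layers (sum 8 = kn with k = 2).
layer_counts = [1, 2, 3, 2]
n, k, q = 4, 2, 0.25
mu = expected_core_size(layer_counts, q)
lower_bound = n * q ** k
lam1, lam2 = sample_partition(r=16, q=q, rng=random.Random(0))
```

By symmetry of the distribution, the same function gives the expected core size on $\Lambda_2$.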


Second Moment Method

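$\mathrm{Var}[|\mathrm{core}(x,\Lambda)|] \le (k^2 n/r)\, E[|\mathrm{core}(x,\Lambda)|] = (k^2 n/r)\,\mu$, where $\mu = E[|\mathrm{core}(x,\Lambda)|]$

By Chebyshev's inequality

$\Pr[\, \mu/2 \le |\mathrm{core}(x,\Lambda)| \le 3\mu/2 \,] \ge 1 - \mathrm{Var}[|\mathrm{core}(x,\Lambda)|]/(\mu/2)^2 \ge 1 - 4k^2 (1/q)^k / r$

since $\mu \ge n q^k$

Choose $r = 8k^2 (1/q)^k$, which makes this probability at least 1/2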
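As a quick arithmetic check of the choice of r, the sketch below evaluates the Chebyshev lower bound $1 - 4k^2(1/q)^k/r$ at $r = 8k^2(1/q)^k$. The function names and the sample values of k and q are hypothetical.

```python
def layers_needed(k, q):
    """The choice r = 8 k^2 (1/q)^k."""
    return 8 * k ** 2 * (1 / q) ** k

def chebyshev_success_bound(k, q, r):
    """Lower bound 1 - 4 k^2 (1/q)^k / r on Pr[mu/2 <= |core| <= 3 mu/2]."""
    return 1 - 4 * k ** 2 * (1 / q) ** k / r

# Hypothetical sample values.
k, q = 2, 0.25
r = layers_needed(k, q)
bound = chebyshev_success_bound(k, q, r)
```

For any k and q the bound at this r equals 1/2, since the factor $4k^2(1/q)^k$ is exactly half of r.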


First fix the trace

• $f^{-1}(1)$ and $\mathcal{R}(\Lambda_1,\Lambda_2)$ are both disjoint unions over the $2^{Sa}$ choices of the trace

• we'll bound densities in each separately

From now on when working with a fixed partition, without saying it explicitly, we will usually assume that the function $f = f_t$ for some trace t


A simplifying assumption

On every input the BP reads every variable at least once

• Can easily ensure this by starting with n dummy queries

• Why bother?

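• It gives an alternate characterization of $\mathrm{core}(x,\Lambda_1)$

• $\mathrm{core}(x,\Lambda_1) = \mathrm{unread}(x,\overline{\Lambda_1})$, where $\overline{\Lambda_1}$ is the set of layers not in $\Lambda_1$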


Analyzing density - key observations

Embedded rectangle R in $\mathcal{R}(\Lambda_1,\Lambda_2)$ with legs $R_A$, $R_B$

• Every x in R has $A = \mathrm{core}(x,\Lambda_1)$ and $B = \mathrm{core}(x,\Lambda_2)$

• $|R_A|$ = # of ways of varying x on A and staying in R

• Let $\rho$ be the part outside A of some x in R (a super-stem)

• $|R_A|$ = # of ways of extending $\rho$ and staying in R


How the cores can vary

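[Figure: the paths of inputs x and y from $v_0$ to the sinks 1 and 0 through layers $L_1$, $L_2$, $L_5$, passing nodes $v_1$, $v_2$, $v_3$, $v_4$; layers are marked as $\Lambda_1$, $\Lambda_2$, or rest.]

For y agreeing with x outside $\mathrm{core}(x,\Lambda_1)$:

$i \in \mathrm{core}(x,\Lambda_1)$ $\Rightarrow$ $x_i$ not read outside $\Lambda_1$ on input x $\Rightarrow$ $x_i$ not read outside $\Lambda_1$ on input y $\Rightarrow$ $i \in \mathrm{core}(y,\Lambda_1)$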


Analyzing density - key observations

Embedded rectangle R in $\mathcal{R}(\Lambda_1,\Lambda_2)$ with legs $R_A$, $R_B$

• Every x in R has $A = \mathrm{core}(x,\Lambda_1)$ and $B = \mathrm{core}(x,\Lambda_2)$

• $|R_A|$ = # of ways of varying x on A and staying in R

• Let $\rho$ be the part outside A of some x in R (a super-stem)

• Any input $y \in D^n$ agreeing with $\rho$ has $A = \mathrm{core}(y,\Lambda_1)$

• $|R_A|$ = # of ways of extending $\rho$ and staying in R


Lower-bounding density of rectangles

Look at rectangles that contain assignments in $f^{-1}(1) \cap (\rho \times D^A)$

• $R^1_A, R^2_A, R^3_A, \ldots$ partition the projection of $f^{-1}(1) \cap (\rho \times D^A)$ on A

To show that most inputs are in rectangles R with large $|R_A|$ it suffices to show that

• Any assignment $\rho \in \text{super-stems}(\Lambda_1)$ is consistent with very few rectangles: $\mathrm{numrects}(\rho)$

• I.e., show $\mathrm{numrects}(\rho)$ is small relative to $|D|^{n-|\rho|}$


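Bounding $\mathrm{numrects}(\rho)$

For $\rho \in \text{super-stems}(\Lambda_1)$, any rectangle consistent with $\rho$ has the same $A = \mathrm{core}(\cdot,\Lambda_1)$

• Only option is the choice of $B = \mathrm{core}(\cdot,\Lambda_2)$, since the stem will be fixed by $\rho$

To count # of choices it suffices to show that $B \,\triangle\, B'$ is small for any rectangles R, R' agreeing with $\rho$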


New Goal: Bounding Symmetric Differences

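For $\rho \in \text{super-stems}(\Lambda_1)$ and x, y agreeing with $\rho$, show $|\mathrm{core}(x,\Lambda_2) \,\triangle\, \mathrm{core}(y,\Lambda_2)|$ is small

…and the same with the roles of $\Lambda_1$, $\Lambda_2$ reversed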


How the cores can vary

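[Figure: the paths of x and y from $v_0$ to the sinks 1 and 0 through layers $L_1$, $L_2$, $L_5$, passing nodes $v_1$, $v_2$, $v_3$, $v_4$; layers are marked as $\Lambda_1$, $\Lambda_2$, or rest.]

Variables read outside $\Lambda_1$ are the same on x and y since all are set by $\rho$

Only way $i \in \mathrm{core}(x,\Lambda_2) \setminus \mathrm{core}(y,\Lambda_2)$ is if $x_i$ is read in $\Lambda_1$ on input y but not on input x

Key: variables in the symmetric difference are read more!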


Using the access pattern to bound the core difference

Partition $f^{-1}(1)$ into classes depending on the access pattern of the input

• For $x \in f^{-1}(1)$ define $\mathrm{pattern}_x : [r] \to [n]$ given by $\mathrm{pattern}_x(t) = \#\{\, i : |\mathrm{Layers}(x,i)| = t \,\}$, the number of variables read in exactly t layers

For each class C we will define $\Lambda_1$, $\Lambda_2$ so that for all x in C,

• variables read in t layers will account for almost all of $\mathrm{core}(x,\Lambda_1)$, $\mathrm{core}(x,\Lambda_2)$

• variables in $\mathrm{core}(x,\Lambda_2) \,\triangle\, \mathrm{core}(y,\Lambda_2)$ will be read in $\ne t$ layers on input either x or y
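The access pattern is just a histogram of layer counts. A minimal sketch, assuming the per-variable layer sets for one input are given as a dictionary; the name `access_pattern` and the example data are hypothetical.

```python
from collections import Counter

def access_pattern(layers_of):
    """pattern_x(t) = number of variables read in exactly t layers on input x."""
    return Counter(len(ls) for ls in layers_of.values())

# Hypothetical access data for one input x: variable index -> set of layers reading it.
layers_of = {1: {1, 2}, 2: {2, 5}, 3: {3}, 4: {1, 4, 5}, 5: {6}}
pattern = access_pattern(layers_of)
```

Inputs with the same `pattern` fall in the same class of the partition described above.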


More precise characterization

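For any t, $\mathrm{core}(x,\Lambda_2) \,\triangle\, \mathrm{core}(y,\Lambda_2)$ is contained in $G_2(x,t) \cup G_2(y,t) \cup H_2(x,t) \cup H_2(y,t)$, where

• $i \in G_2(z,t)$ iff $i \in \mathrm{core}(z,\Lambda_2)$ but $|\mathrm{Layers}(z,i)| \ne t$

• $i \in H_2(z,t)$ iff $|\mathrm{Layers}(z,i) \cap \Lambda_2| = t$, $|\mathrm{Layers}(z,i) \cap \Lambda_1| \ge 1$, and $\mathrm{Layers}(z,i) \subseteq \Lambda_1 \cup \Lambda_2$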


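Recall method for choosing $\Lambda_1$, $\Lambda_2$

Distribution $\mathcal{D}_q$ for probability $q \le 1/2$

• $\Pr[L_i \in \Lambda_1] = \Pr[L_i \in \Lambda_2] = q$

• $\Pr[L_i \notin \Lambda_1 \cup \Lambda_2] = 1 - 2q$

• Independent for each i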


Choosing the probabilities

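Claim: There is a set Q of 2k probabilities q, each at least $k^{-16k}$, such that for almost all z, there is an integer $t = t(z) \le k$ with

$E[|G_2(z,t) \cup H_2(z,t)|] \ll E[|\mathrm{core}(z,\Lambda_2)|]$

for $\Lambda_1$, $\Lambda_2$ chosen from $\mathcal{D}_q$, where $q = q(z) \in Q$

With these values

$E[|\mathrm{core}(z,\Lambda_2)|] \ge n q^k \ge n\,(k^{-16k})^k = n\,k^{-16k^2}$

For $k \le c\sqrt{\log n / \log\log n}$ this is $n^{1-O(c^2)}$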


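Some issues

Inputs x and y extending some $\rho \in \text{super-stems}(\Lambda_1)$ may not have the same q and t

• We actually apply the above reasoning separately for disjoint subsets $I_{q,t} \subseteq f^{-1}(1)$ of inputs

We can bound $|\mathrm{core}(x,\Lambda_2) \,\triangle\, \mathrm{core}(y,\Lambda_2)|$ relative to $\max\{|\mathrm{core}(x,\Lambda_2)|, |\mathrm{core}(y,\Lambda_2)|\}$ but need it in terms of $|\mathrm{core}(x,\Lambda_1)| = |\mathrm{core}(y,\Lambda_1)|$

• Expectations of the cores of an input on $\Lambda_1$ and $\Lambda_2$ are the same, and concentration of $|\mathrm{core}(z,\Lambda_1)|$ about its mean says these are similar for x and y because $\mathrm{core}(x,\Lambda_1) = \mathrm{core}(y,\Lambda_1)$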


Randomized Lower Bounds

Recall: once $\Lambda_1$, $\Lambda_2$ are fixed we obtain the partition $\mathcal{R}(\Lambda_1,\Lambda_2)$ of $f^{-1}(1)$ into embedded rectangles

• We only keep the good part of each partition

There are 2k choices of $\Lambda_1$, $\Lambda_2$ that suffice to cover most of $f^{-1}(1)$

• Each input in the good part of $f^{-1}(1)$ is contained in at most 2k embedded rectangles

• Implies the original error is multiplied at most 2k times when looking at embedded rectangles

• Works with initial error O(1/k)


Proof of the Claim: Tailoring q to the access pattern to bound $G_2(z,t)$ and $H_2(z,t)$

Let $s_t = \mathrm{pattern}_z(t) = \#\{\, i : |\mathrm{Layers}(z,i)| = t \,\}$

Define $\mu(z,q) = \sum_t s_t q^t$

• Note $\mu(z,q) = E[|\mathrm{core}(z,\Lambda_1)|] = E[|\mathrm{core}(z,\Lambda_2)|]$

Let t(q) be the index of the largest term $s_t q^t$ in $\sum_t s_t q^t$

• Pick the smallest such index if there are ties

Want to choose q so the term with index t(q) is $\gg$ the rest


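$\mu(z,q) = \sum_t s_t q^t$, where $s_t$ is the number of variables read in exactly t layers

t(q) never increases as q decreases

• Decreasing q shifts weight away from terms with larger t

If $q \le 1/(4k)$ then $t(q) \le k$

• Since $\sum_t s_t = n$ it follows that $\sum_{t>k} s_t q^t \le n q^{k+1}$

• $\sum_t s_t q^t = E[|\mathrm{core}(z,\Lambda_1)|] \ge n q^k$

• First k terms add up to all but a $q \le 1/(4k)$ fraction of $\sum_t s_t q^t$

• One of the first k terms must be larger than all the other terms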


Choices of q

$Q = \{\, q_b = k^{-8b} : 1 \le b \le 2k \,\}$

Since $1 \le t(q_b) \le k$ and $t(q_{b+1}) \le t(q_b)$ are integral, by the pigeonhole principle there must be a b such that $t(q_{b+1}) = t(q_b) = t(q_{b-1})$

• Set $q(z) = q_b$

• $t(q_{b+1}) = t(q_b)$ implies term $\gg$ terms with smaller t

• $t(q_b) = t(q_{b-1})$ implies term $\gg$ terms with larger t

This bounds $G_2(z,t)$

Bounding $H_2(z,t)$ is a little trickier since accesses are divided between $\Lambda_1$, $\Lambda_2$; forces at least a factor of k decrease between $q_b$ and $q_{b+1}$
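The selection of q can be traced on a concrete access pattern. This is a sketch only: it computes the index of the dominant term for each $q_b = k^{-8b}$ with exact rational arithmetic and scans for a b whose two neighbors give the same index. The names `dominant_index` and `stable_choice` and the example pattern are hypothetical, and the scan simply returns `None` if no such b lies strictly inside the range $1 \le b \le 2k$.

```python
from fractions import Fraction

def dominant_index(pattern, q):
    """t(q): index t of the largest term pattern[t] * q**t (smallest t on ties)."""
    best_t, best_val = None, None
    for t in sorted(pattern):
        val = pattern[t] * q ** t
        if best_val is None or val > best_val:  # strict '>' keeps the smallest t on ties
            best_t, best_val = t, val
    return best_t

def stable_choice(pattern, k):
    """Return (b, t) with t(q_{b-1}) == t(q_b) == t(q_{b+1}) for q_b = k**(-8b),
    scanning 2 <= b <= 2k - 1; return None if no such b is found."""
    ts = {b: dominant_index(pattern, Fraction(1, k ** (8 * b)))
          for b in range(1, 2 * k + 1)}
    for b in range(2, 2 * k):
        if ts[b - 1] == ts[b] == ts[b + 1]:
            return b, ts[b]
    return None

# Hypothetical pattern: 1 variable read in 1 layer, 10**6 in 2 layers, 10**12 in 3 layers.
pattern = {1: 1, 2: 10 ** 6, 3: 10 ** 12}
k = 3
t_at_q1 = dominant_index(pattern, Fraction(1, k ** 8))
choice = stable_choice(pattern, k)
```

In this example the dominant index drops from 3 at $q_1$ to 1 at $q_2$ and stays there, so the scan settles on the first b whose neighbors both give index 1.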


What functions are this hard?

Computing $x^T M_y x \equiv 0 \pmod 2$ for $x \in \{0,1\}^n$, $y \in \{0,1\}^{2n-1}$

• Defined in [Ajtai 99b]

Given $x \in \{0,1\}^n$, compute the parity of the number of (i,j) such that $x_i \wedge x_j \wedge x_{i+j}$ is true

• By reduction from previous problem [Ajtai 99b]

Element distinctness: Given $x \in [n^2]^n$ determine whether or not all $x_i$ are distinct.


Why ED doesn’t have large embedded rectangles

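Let $R_A \subseteq D^A$ and $R_B \subseteq D^B$ have density more than $2^{-|A|}$ and $2^{-|B|}$ respectively

• Then more than |D|/2 elements of D appear in $R_A$, and similarly for $R_B$

• Some element of D therefore appears in both legs, so the rectangle contains a non-distinct input vector

If $|D| \ge n^2$ then $|ED^{-1}(1)| \ge |D^n|/e$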
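The counting step can be checked by brute force on a tiny domain. This is a sketch under the assumption $|D| = 4$ and $|A| = |B| = 2$; the function names and the example legs are hypothetical. It only needs to examine the smallest legs with density above $2^{-m}$, because adding assignments to a leg can only add values.

```python
from itertools import combinations, product

def element_distinctness(x):
    """1 if all entries of x are distinct, else 0."""
    return int(len(set(x)) == len(x))

def values_used(leg):
    """Set of domain elements appearing in some assignment of the leg."""
    return {v for assignment in leg for v in assignment}

def min_values_used_by_dense_legs(D, m):
    """Smallest number of distinct values used by any leg in D^m whose density
    is strictly above 2**-m (brute force, only feasible for tiny D and m)."""
    all_assignments = list(product(D, repeat=m))
    size = len(all_assignments) // 2 ** m + 1  # smallest size with density > 2^-m
    return min(len(values_used(leg)) for leg in combinations(all_assignments, size))

def non_distinct_member(R_A, R_B):
    """Return a concatenation a + b (a in R_A, b in R_B) with a repeated value, or None."""
    for a in sorted(R_A):
        for b in sorted(R_B):
            if not element_distinctness(a + b):
                return a + b
    return None

D = range(4)
fewest = min_values_used_by_dense_legs(D, 2)
# Hypothetical legs of density 5/16 > 2^-2, each internally distinct.
R_A = {(0, 1), (1, 0), (0, 2), (2, 0), (1, 2)}
R_B = {(3, 2), (2, 3), (3, 1), (1, 3), (0, 3)}
witness = non_distinct_member(R_A, R_B)
```

Here `fewest` exceeds $|D|/2 = 2$, so any two such legs share a value, and `witness` is one input of the rectangle on which element distinctness evaluates to 0.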

Randomized bounds extend set-disjointness technique of [Babai-Frankl-Simon 86] n-2 error


The end

Bounds for quadratic form based on rigidity argument [Ajtai 99b]

Given rigidity, randomized bounds follow from discrepancy argument using pairwise independence (Lindsay’s Lemma) [BSSV 00]

Open: better bounds, more functions