Post on 21-Jun-2015
The Power of Randomization
Example 1: Checking Equality
• Two large files at two different locations.
• Are they identical?
– Decide by communicating only a small amount of information!
Checking Equality: The Challenge
• Two large numbers N1 and N2, n bits each
• Communication allowed: m << n bits
• Possible?
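Checking Equality: Impossibility
• Suppose the communication is a deterministic function of N1 alone
• Since m << n:
– Two different values of N1 will have the same m-bit communication pattern
– Switch N1 from one to the other while N2 stays fixed: the correct answer changes (YES -> NO), but the communication does not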
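The pigeonhole argument above can be checked by brute force on a toy instance. This is a minimal sketch, not part of the slides: `message` is a hypothetical deterministic scheme (send the low m bits of N1), and n = 4, m = 2 are illustrative values.

```python
def message(n1: int, m: int) -> int:
    """Hypothetical deterministic scheme: send the low m bits of N1."""
    return n1 & ((1 << m) - 1)

def find_collision(n: int, m: int):
    """Return two distinct n-bit numbers that produce the same m-bit message."""
    seen = {}
    for n1 in range(1 << n):
        msg = message(n1, m)
        if msg in seen:
            return seen[msg], n1
        seen[msg] = n1
    return None  # only possible when m >= n

a, b = find_collision(n=4, m=2)
# The receiver sees the same bits for N1 = a and N1 = b, so with N2 = a
# it cannot tell the YES case (N1 = a) from the NO case (N1 = b).
```

Any other deterministic `message` with m < n output bits would collide in the same way; only the colliding pair would change.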
Checking Equality: Randomized Algorithms
• Communicate N1 mod M for some number M; the other side compares it with N2 mod M
• If N1 = N2 then you always get YES
• If N1 != N2 then you get a (wrong) YES exactly when M divides N1 - N2
Checking Equality: Analysis
• Probability that N1 != N2 but M divides N1 - N2?
• Probability over what?
– Over the choice of M, not over N1, N2
• Choose M at random in the range 1..2^m
Checking Equality: Analysis
• How many factors does N1 - N2 have?
– N1 - N2 <= 2^n, so at most about (2^n)^(1/log n)
• If we choose M randomly in the range 1..2 * (2^n)^(1/log n)
– Probability that N1 != N2 but M divides N1 - N2 is <= 1/2
– So m is ~ n/log n bits (minor gains)
Checking Equality: Use Prime Numbers
• How many prime factors does N1 - N2 have?
– N1 - N2 <= 2^n, so at most 2n/log n
• If we choose M to be a random prime in 1..4n
– There are at least 4n/log(4n) primes in this range
– Probability that N1 != N2 but M divides N1 - N2 is <= ~ 1/2
– So m is ~ log n bits (major gains)
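The bound of roughly 1/2 comes from dividing the number of bad primes by the number of primes available. A short worked version, assuming log is base 2 and taking the two counts above at face value (at most 2n/log n prime factors of N1 - N2, at least 4n/log(4n) primes up to 4n):

```latex
\Pr[\text{wrong YES}]
  \le \frac{2n/\log n}{4n/\log(4n)}
  = \frac{\log(4n)}{2\log n}
  = \frac{2 + \log n}{2\log n}
  = \frac{1}{2} + \frac{1}{\log n}
```

The extra 1/log n term shrinks as n grows, which is why the slide writes the bound as approximately 1/2.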
Checking Equality: The Solution
• Two large numbers N1 and N2, n bits each
• log n bits of communication
– Remainder w.r.t. a random prime in the range 1..4n
• Error prob < 1/2
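The protocol above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the slides' own code: the function names are hypothetical, both parties are simulated in one process, and the sender transmits M together with the remainder (the slides do not say how M is shared).

```python
import random

def is_prime(x: int) -> bool:
    """Trial division; adequate because candidates are at most 4n."""
    if x < 2:
        return False
    i = 2
    while i * i <= x:
        if x % i == 0:
            return False
        i += 1
    return True

def random_prime(limit: int, rng: random.Random) -> int:
    """Pick a uniformly random prime in 2..limit by rejection sampling."""
    while True:
        candidate = rng.randint(2, limit)
        if is_prime(candidate):
            return candidate

def fingerprint(n1: int, n_bits: int, rng: random.Random):
    """Sender side: choose a random prime M <= 4n, return (M, N1 mod M)."""
    m = random_prime(4 * n_bits, rng)
    return m, n1 % m

def check_equal(n2: int, m: int, remainder: int) -> bool:
    """Receiver side: answer YES iff N2 leaves the same remainder mod M."""
    return n2 % m == remainder

# Usage: a 1000-bit number, so M is a prime of at most about 12 bits.
rng = random.Random(0)
n_bits = 1000
n1 = rng.getrandbits(n_bits)
m, r = fingerprint(n1, n_bits, rng)
same = check_equal(n1, m, r)            # equal inputs always give YES
fooled = check_equal(n1 + m, m, r)      # M divides the difference: wrong YES
```

Both M and the remainder fit in about log(4n) bits each, so the message stays logarithmic in n even though the numbers themselves have n bits.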
Checking Equality: Reducing Error Prob
• Repeat k times, with an independent random prime each time
• Communication is k log n bits
• Error prob < (1/2)^k
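Repetition can be sketched as follows: answer YES only if every round agrees. The function names and the small primes in the usage lines are hypothetical, chosen only so the arithmetic is easy to follow; in the real protocol each prime would be drawn at random from 1..4n.

```python
from typing import Sequence

def repeated_check(n1: int, n2: int, primes: Sequence[int]) -> bool:
    """Answer YES only if N1 and N2 agree modulo every chosen prime."""
    return all(n1 % p == n2 % p for p in primes)

def error_bound(k: int) -> float:
    """Upper bound (1/2)^k on the chance that all k rounds are fooled."""
    return 0.5 ** k

# Usage: 100 and 107 differ by 7, so the single prime 7 is fooled,
# but adding the primes 11 and 13 exposes the difference.
one_round = repeated_check(100, 107, [7])
three_rounds = repeated_check(100, 107, [7, 11, 13])
```

A wrong YES now requires every chosen prime to divide N1 - N2, which is what drives the error down exponentially in k.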
Checking Equality: Example Numbers
• 10GB file, n = 10^10
• Desired error prob: about 10^-30
• Communication: 99 * 33 = 3267 bits, about 400 bytes
• If 10 billion people each do 10 billion checks a day, the probability that even one of the checks is erroneous is at most 1 in 10 billion
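The figures above can be reproduced with a few lines of arithmetic. This is a sketch that takes the slide's values (n = 10^10, 99 rounds, 33 bits per round) as given and uses exact fractions for the probabilities.

```python
import math
from fractions import Fraction

n = 10**10                                   # bits in the file, as on the slide
bits_per_round = math.floor(math.log2(n))    # log2(10^10) is about 33.2
rounds = 99                                  # number of repetitions on the slide

total_bits = rounds * bits_per_round         # 99 * 33
total_bytes = total_bits / 8
error_per_check = Fraction(1, 2**rounds)     # (1/2)^99

# Union bound over 10 billion people doing 10 billion checks each,
# taking the per-check error to be the target 10^-30.
checks = 10**10 * 10**10
any_error = checks * Fraction(1, 10**30)
```

With 99 rounds the bound (1/2)^99 is about 1.6 * 10^-30, slightly above the 10^-30 target, so the slide's numbers are approximate; one more round brings the bound under 10^-30.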
Another Example: PCA
• Fit a line through the origin to a collection of points so as to maximize the sum of squares of the projections onto the line
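For points in the plane, the best line through the origin has a closed form, which makes the objective easy to see. This is a minimal 2-D sketch with hypothetical function names; it uses the fact that the sum of squared projections onto the direction (cos t, sin t) is maximized at t = 0.5 * atan2(2*Sxy, Sxx - Syy). Higher dimensions would need an eigenvector computation instead.

```python
import math

def best_line_through_origin(points):
    """Unit direction (2-D) maximizing the sum of squared projections."""
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)

def projection_score(points, direction):
    """Sum of squared projections of the points onto a unit direction."""
    ux, uy = direction
    return sum((x * ux + y * uy) ** 2 for x, y in points)

# Usage: points lying on the line y = x.
points = [(1, 1), (2, 2), (-3, -3)]
direction = best_line_through_origin(points)
score = projection_score(points, direction)
```

The points are not mean-centered here because the line is required to pass through the origin.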
PCA: Random Sampling
• Too many points?
• Pick a random sample
– The fitting line doesn’t change too much?
PCA: Random Sampling
• How should you sample here?
Puzzle: Checking Matrix Products
• Given three matrices A, B and C, check if A = BC
– mod p for simplicity
• Matrices are n*n
• Easy to do in n^3 time
• Can you do better?
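For reference, the n^3 baseline mentioned above can be sketched directly. The function name is hypothetical, matrices are assumed to be square lists of lists of integers, and this is only the straightforward check, not an answer to the puzzle.

```python
def equals_product_mod_p(a, b, c, p):
    """Check A == B*C (mod p) by computing every entry of the product.

    Each of the n*n entries needs n multiplications, hence n^3 in total.
    """
    n = len(a)
    for i in range(n):
        for j in range(n):
            entry = sum(b[i][k] * c[k][j] for k in range(n)) % p
            if entry != a[i][j] % p:
                return False
    return True

# Usage with 2*2 matrices and p = 7.
b = [[1, 2], [3, 4]]
c = [[5, 6], [7, 8]]
good = equals_product_mod_p([[19, 22], [43, 50]], b, c, 7)
bad = equals_product_mod_p([[19, 22], [43, 51]], b, c, 7)
```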