Advanced Algorithms Assignment

Advanced Algorithms Design
Professor: Dr. Yangjun Chen

Student Name: Ajay Ramganesh

Student Number: 3064395

Assignment 1

Question-1: (25) Please prove the following equalities and inequalities:
1. $10n \neq \Theta(n^2)$
2. $100n = \Theta(n)$
3. $2^{2n} \neq \Theta(2^n)$

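Answer:
$10n \neq \Theta(n^2)$

Formula of $\Theta$-notation: $\Theta(g(n)) = \{f(n) : \exists$ positive constants $c_1$, $c_2$, and $n_0$ such that $\forall n \ge n_0$, we have $0 \le c_1 g(n) \le f(n) \le c_2 g(n)\}$.

Let us assume, for contradiction, that $10n = \Theta(n^2)$. Then there are 3 positive constants $c_1$, $c_2$, and $n_0$ such that, for all $n \ge n_0$:
$c_1 n^2 \le 10n \le c_2 n^2$

Take the left-hand side (LHS) of the inequality:
$c_1 n^2 \le 10n \Rightarrow c_1 n \le 10 \Rightarrow n \le 10/c_1$
The inequality holds only when $n \le 10/c_1$ and fails for every $n > 10/c_1$, so no choice of $n_0$ makes it hold for all $n \ge n_0$. This contradicts the assumption, therefore $10n \neq \Theta(n^2)$. Hence proved.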

$100n = \Theta(n)$
As per the $\Theta$-notation formula, we need 3 positive constants $c_1$, $c_2$, and $n_0$ such that, for all $n \ge n_0$:
$c_1 n \le 100n \le c_2 n$
Left-hand side: $c_1 n \le 100n \Rightarrow c_1 \le 100$
Right-hand side: $100n \le c_2 n \Rightarrow 100 \le c_2$
Any constants with $0 < c_1 \le 100$ and $100 \le c_2$ satisfy both inequalities for every $n \ge 1$, so we can take $n_0 = 1$.
E.g., $99n \le 100n \le 101n$ holds for every $n \ge 1$. Hence proved.

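$2^{2n} \neq \Theta(2^n)$
Let us assume, for contradiction, that $2^{2n} = \Theta(2^n)$. Then there are 3 positive constants $c_1$, $c_2$, and $n_0$.
As per the formula: $\Theta(g(n)) = \{f(n) : \exists$ positive constants $c_1$, $c_2$, and $n_0$ such that $\forall n \ge n_0$, we have $0 \le c_1 g(n) \le f(n) \le c_2 g(n)\}$.
$c_1 2^n \le 2^{2n} \le c_2 2^n$
We take the right-hand side of the inequality:
$2^{2n} \le c_2 2^n \Rightarrow 2^n \le c_2 \Rightarrow n \le \log_2 c_2$
So the inequality holds only when $n \le \log_2 c_2$ and fails for every $n > \log_2 c_2$. No choice of $n_0$ makes it hold for all $n \ge n_0$, which contradicts the assumption. Therefore $2^{2n} \neq \Theta(2^n)$. Hence proved.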
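As a numeric sanity check of the three claims above (not a proof), the following sketch samples the ratio f(n)/g(n) for growing n. The helper name `ratio_samples` and the sample points are assumptions made for illustration. A ratio that shrinks toward 0 or grows without bound is consistent with the two inequalities, while a constant ratio is consistent with the equality.

```python
def ratio_samples(f, g, ns):
    """Return f(n) / g(n) for each n in ns."""
    return [f(n) / g(n) for n in ns]

# 10n versus n^2: the ratio is 10/n, which shrinks toward 0.
r1 = ratio_samples(lambda n: 10 * n, lambda n: n * n, [10, 100, 1000])

# 100n versus n: the ratio is the constant 100.
r2 = ratio_samples(lambda n: 100 * n, lambda n: n, [10, 100, 1000])

# 2^(2n) versus 2^n: the ratio is 2^n, which grows without bound.
r3 = ratio_samples(lambda n: 2 ** (2 * n), lambda n: 2 ** n, [10, 20, 30])

print(r1)
print(r2)
print(r3)
```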

Question-2: (15) In Fig. 2, we show a network, in which each node stands for a page and each arc for a link from a page to another. Please give the transition matrix for the network. Also, explain why the solution to the equation $A = MA$ can be used as the estimation of page importance, where $A$ is a vector of $n$ variables and $M$ is an $n \times n$ transition matrix.

Answer:
Transition Matrix:
P1 P2 P3 P4 P5
0 0 0 0 0 0 1 0 0 M = 0 0 0 0 0 0 1 0 0 0

Web navigation over this network can be modeled as the moves of a random walker. Let M have entry $s_{xy}$ in row x and column y, where:

1. $s_{xy} = 1/r$ if page y has a link to page x, and there are a total of $r \ge 1$ pages that y links to.
2. $s_{xy} = 0$ otherwise.
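The two rules above can be sketched in code as follows. The three-page link structure and the function name `transition_matrix` are hypothetical, chosen only for illustration; they are not the network of Fig. 2.

```python
from fractions import Fraction

def transition_matrix(links, pages):
    """Build M where M[x][y] = 1/r if page y links to page x
    (r = number of pages y links to), and 0 otherwise.
    links[y] is the list of pages that y links to."""
    index = {p: k for k, p in enumerate(pages)}
    n = len(pages)
    M = [[Fraction(0)] * n for _ in range(n)]
    for y in pages:
        targets = links.get(y, [])
        for x in targets:
            M[index[x]][index[y]] = Fraction(1, len(targets))
    return M

# Hypothetical web: A links to B and C, B links to C, C links to A.
pages = ["A", "B", "C"]
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
M = transition_matrix(links, pages)
for row in M:
    print([str(v) for v in row])
```

Each column of M sums to 1 as long as every page has at least one out-link, which is what lets M act on a probability distribution.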

After a large number of moves, the walker's distribution over possible locations is the same at each step. This limiting distribution satisfies $A = MA$, so the solution to $A = MA$ can be used as the estimation of page importance, where $A$ is a vector of $n$ variables and $M$ is an $n \times n$ transition matrix.

So, the time that the random walker spends at a page is used as the measurement of importance.

In practice, $A$ is computed by starting from an initial distribution and repeatedly multiplying it by $M$. After 50 to 100 iterations of this process, the fraction of time the walker spends on each page is very close to the limiting values. So, the equation $A = MA$ gives the fraction of time the user spends on each page, and this can be used as the estimation of page importance.
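A minimal sketch of that iteration, assuming a hypothetical three-page transition matrix (not the one from Fig. 2) and a uniform starting distribution. The function name `power_iteration` is an assumption for illustration.

```python
def power_iteration(M, iterations=100):
    """Repeatedly apply A <- M A, starting from the uniform distribution."""
    n = len(M)
    A = [1.0 / n] * n
    for _ in range(iterations):
        A = [sum(M[x][y] * A[y] for y in range(n)) for x in range(n)]
    return A

# Hypothetical matrix: column y holds the out-link probabilities of page y.
M = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
A = power_iteration(M)
print([round(a, 4) for a in A])
```

For this particular matrix the fixed point of A = MA with entries summing to 1 is (0.4, 0.2, 0.4), and the iteration settles on it well within 100 steps.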

Question-3: (10) Explain why the following equation (for estimating the importance of pages) works in the presence of spider traps and dead ends.
$P_{new} = \beta M P_{old} + (1 - \beta) T$

Answer: When a user enters a set of pages with no links leading outside the set, it's called a Spider Trap. When a user enters a page that has no out-links at all, it's called a Dead End.

In both of the above scenarios, the user gets stuck and the walk ends. If we apply the relaxation to the matrix of a Web with Spider Traps, it can result in a limiting distribution where all probabilities outside a spider trap are 0.

To avoid this, we limit the time a random walker is allowed to wander at random. The walker follows a random out-link with probability $\beta$ (normally, $0.8 \le \beta \le 0.9$), and with probability $1 - \beta$ (called the taxation rate) we remove that walker and deposit a new walker at a randomly chosen Web page.

Using the above strategy:
i. If the walker gets stuck in a Spider Trap, after a few time steps the walker will disappear and be replaced by a new walker.
ii. If the walker reaches a dead end and disappears, a new walker takes over shortly.

Let $P_{new}$ and $P_{old}$ be the new and old distributions of the location of the walker. After 1 iteration, we can express the relationship between them as follows, with $\beta = 0.8$ and $1 - \beta = 0.2$:

$P_{new} = 0.8 \, M P_{old} + 0.2 \, T$

Here $M$ is the transition matrix and $T$ is the "fraction of time" vector.

Based on the above equation, the term $0.8 \, M P_{old}$ corresponds to the walker following a random out-link with probability 0.8, and the term $0.2 \, T$ corresponds to restarting the walker from a random place with probability 0.2, which helps the walker come out of a dead end or spider trap.
This is the reason why $P_{new} = \beta M P_{old} + (1 - \beta) T$ is used to overcome dead ends and spider traps: it moves the walker out of those situations.
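The effect of the taxation term can be seen in a small sketch. The three-page matrix below is hypothetical (page C links only to itself, forming a spider trap), and T is assumed to be the uniform vector with entries 1/n, matching "a randomly chosen Web page". The function name `taxed_iteration` is also an assumption.

```python
def taxed_iteration(M, beta=0.8, iterations=100):
    """Iterate P_new = beta * M * P_old + (1 - beta) * T."""
    n = len(M)
    T = [1.0 / n] * n  # assumption: the new walker lands on a uniformly random page
    P = [1.0 / n] * n
    for _ in range(iterations):
        P = [
            beta * sum(M[x][y] * P[y] for y in range(n)) + (1 - beta) * T[x]
            for x in range(n)
        ]
    return P

# Hypothetical web: A links to B and C, B links to C, C links only to itself.
M = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 1.0],
]
taxed = taxed_iteration(M, beta=0.8)
untaxed = taxed_iteration(M, beta=1.0)  # beta = 1 disables the taxation term
print([round(p, 4) for p in taxed])
print([round(p, 4) for p in untaxed])
```

Without taxation all probability drains into the trap page C. With $\beta = 0.8$ the pages outside the trap keep a nonzero share.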

Question-4 (20) Fig. 3 shows a tree encoding. The quadruples can be stored as a sequence sorted by LeftPos values by using the depth-first search. Design an algorithm to transform it into another sequence sorted by RightPos values.

Answer:
Algorithm:
Let X(i) be all data streams sorted by LeftPos.
Let R(i) be new data streams sorted by RightPos.

Begin
repeat until each X(i) becomes empty
{
    identify i such that the first element v of X(i) is of the minimal LeftPos value;
    remove v from X(i);
    while Stack is not empty and Stack.top() is not v's ancestor do
    {
        d ← Stack.pop(); Let d = (j, u);
        put u at the end of R(j);
    }
    Stack.push(i, v);
}
while Stack is not empty do
{
    d ← Stack.pop(); Let d = (j, u);
    put u at the end of R(j);
}
End
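A runnable sketch of the same stack idea, simplified to a single input sequence instead of several streams X(i). The field names of `Node`, the ancestor test (interval containment within the same document), and the sample tree are assumptions for illustration, not the encoding of Fig. 3.

```python
from collections import namedtuple

# A quadruple; the field names are assumed for this sketch.
Node = namedtuple("Node", ["doc_id", "left", "right", "level"])

def is_ancestor(a, v):
    """a is an ancestor of v if a's interval strictly contains v's interval."""
    return a.doc_id == v.doc_id and a.left < v.left and v.right < a.right

def sort_by_right_pos(nodes_by_left):
    """Turn a LeftPos-sorted sequence into a RightPos-sorted one."""
    stack, out = [], []
    for v in nodes_by_left:
        # Every stacked node that is not an ancestor of v closes before v opens,
        # so it can be emitted now.
        while stack and not is_ancestor(stack[-1], v):
            out.append(stack.pop())
        stack.append(v)
    # Remaining nodes are emitted innermost first.
    while stack:
        out.append(stack.pop())
    return out

# Hypothetical tree: A contains B and D, B contains C.
nodes = [
    Node(1, 1, 8, 1),  # A
    Node(1, 2, 5, 2),  # B
    Node(1, 3, 4, 3),  # C
    Node(1, 6, 7, 2),  # D
]
result = sort_by_right_pos(nodes)
print([n.right for n in result])
```

Each node is pushed and popped exactly once, so the transformation runs in time linear in the number of quadruples.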

Question-5: (15) In the following table, we show the key words of five documents, as well as the key word sequences sorted by frequencies. Please construct a trie for the sorted sequences and a header table for all the key words to speed up the evaluation of conjunctive queries of the form $word_1 \wedge word_2 \wedge \dots \wedge word_i$. Also, show how a conjunctive query is evaluated by using the trie.

Answer:

The frequency of each word is found by the following:
af(w) = (No. of docs containing w) / (No. of docs)

Frequency of each word:
af(f) = 4/5, af(c) = 4/5, af(a) = 3/5, af(b) = 3/5, af(i) = 3/5, af(m) = 2/5, af(p) = 2/5, af(h) = 1/5, af(j) = 1/5
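The frequency formula can be sketched as below. The four documents and their key words are hypothetical (they are not the table from the question), and sorting each key word sequence by descending frequency with alphabetical tie-breaking is an assumption.

```python
from fractions import Fraction

def appearance_frequency(docs):
    """af(w) = (number of documents containing w) / (number of documents)."""
    n = len(docs)
    words = set().union(*docs.values())
    return {w: Fraction(sum(w in kws for kws in docs.values()), n) for w in words}

def sort_by_frequency(keywords, af):
    """Order key words by descending frequency; ties are broken alphabetically."""
    return sorted(keywords, key=lambda w: (-af[w], w))

# Hypothetical documents and key words.
docs = {1: {"x", "y", "z"}, 2: {"y", "x"}, 3: {"z", "x"}, 4: {"x"}}
af = appearance_frequency(docs)
sequences = {d: sort_by_frequency(kws, af) for d, kws in docs.items()}
print(af["x"], af["y"], af["z"])
print(sequences)
```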

[Figure: trie rooted at Root, with a header table (columns Items and Links) listing the key words c, f, a, b, i, m, p, h, j. The trie nodes carry document-identifier sets such as {1,2,4,5} and {1,2,3,5}.]

Evaluation of a query in the trie
The following steps are used to evaluate the query in the trie:
1. Let $Q = word_1 \wedge word_2 \wedge \dots \wedge word_k$ be a query.
2. Sort the words in Q increasingly according to their appearance frequency: $word_{i_1} \le \dots \le word_{i_k}$.
3. Find a node in the trie that is labeled with $word_{i_1}$.
4. If the path from the root to $word_{i_1}$ contains all $word_j$ (j = 1, ..., k), return the document identifiers associated with $word_{i_1}$.
5. The check can be done by searching the path bottom-up, starting from $word_{i_1}$. In this process, we first try to find $word_{i_2}$, then $word_{i_3}$, and so on.

Example: We have a query, say $c \wedge b \wedge f$.
The frequency of each query word: af(c) = 4/5, af(b) = 3/5, af(f) = 4/5
After sorting by frequency in increasing order, we have the result: b, f, c

[Figure: the header table (Items, Links: c, f, a, b, i, m, p, h, j) and the trie, shown again for the example query.]
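The evaluation steps for Question 5 can be sketched as follows. This is one possible design under stated assumptions: each trie node stores the identifiers of the documents whose sequence passes through it, the header table maps a word to all nodes labeled with it, and `rank` is a fixed global order of words (most frequent first). The class and function names and the sample sequences are hypothetical, not the data of the question.

```python
class TrieNode:
    def __init__(self, word, parent):
        self.word = word
        self.parent = parent
        self.children = {}
        self.doc_ids = set()  # documents whose sequence passes through this node

def build_trie(sequences):
    """Build the trie and a header table mapping word -> list of nodes."""
    root = TrieNode(None, None)
    header = {}
    for doc_id, seq in sequences.items():
        node = root
        for w in seq:
            if w not in node.children:
                child = TrieNode(w, node)
                node.children[w] = child
                header.setdefault(w, []).append(child)
            node = node.children[w]
            node.doc_ids.add(doc_id)
    return root, header

def evaluate(query, header, rank):
    """Return the documents that contain every word of the conjunctive query."""
    if any(w not in header for w in query):
        return set()
    # Least frequent word first (largest rank), as in the steps above.
    words = sorted(query, key=lambda w: rank[w], reverse=True)
    result = set()
    for node in header[words[0]]:
        remaining = set(words[1:])
        p = node.parent
        while p is not None and remaining:  # bottom-up search of the path
            remaining.discard(p.word)
            p = p.parent
        if not remaining:
            result |= node.doc_ids
    return result

# Hypothetical sorted sequences (most frequent word first).
sequences = {1: ["x", "y", "z"], 2: ["x", "y"], 3: ["x", "z"], 4: ["x"]}
rank = {"x": 0, "y": 1, "z": 2}
root, header = build_trie(sequences)
print(evaluate({"x", "z"}, header, rank))
print(evaluate({"y", "z"}, header, rank))
```

Starting from the least frequent word keeps the number of candidate nodes small, and the bottom-up walk touches only one root path per candidate.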

Question-6: (20) The following is a directed graph G. Please find a spanning tree of it and then label the nodes in the spanning tree by intervals. Also, construct an interval sequence for each node, which can be used to check the reachability queries with respect to G.

Answer:

Spanning Tree:

a[0,13)
    b[1,6)
        c[2,5)
            p[3,5)
                k[4,5)
        d[5,6)
    r[6,10)
        e[7,10)
            f[8,9)
            g[9,10)
    h[10,13)
        i[11,12)
        j[12,13)
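The interval labels of the spanning tree can be reproduced with a preorder traversal: a node receives [start, end), where start is its preorder number and end is the next unused number after its whole subtree has been visited. In the sketch below, the child lists are inferred from the nesting of the intervals above, and the function name `label_intervals` is an assumption.

```python
def label_intervals(tree, root):
    """Assign each node [start, end) by a preorder traversal of the tree."""
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        labels[node] = (start, counter)

    visit(root)
    return labels

# Child lists inferred from the interval nesting of the spanning tree.
tree = {
    "a": ["b", "r", "h"],
    "b": ["c", "d"],
    "c": ["p"],
    "p": ["k"],
    "r": ["e"],
    "e": ["f", "g"],
    "h": ["i", "j"],
}
labels = label_intervals(tree, "a")
print(labels)
```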

Topological order of nodes:

Topological order: a, b, h, j, r, e, i, f, g, c, p, d, k

Reverse topological order: k, d, p, c, g, f, i, e, r, j, h, b, a

L(k) = [4,5)
L(d) = [4,5)[5,6)
L(p) = [3,5)
L(c) = [2,5)
L(g) = [4,5)[5,6)[9,10)
L(f) = [4,5)[5,6)[8,9)
L(i) = [4,5)[5,6)[8,9)[11,12)
L(e) = [7,10)
L(r) = [2,5)[6,10)
L(j) = [2,5)[6,10)[12,13)
L(h) = [4,5)[5,6)[7,10)[10,13)
L(b) = [1,6)
L(a) = [0,13)

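Reachability Query Check: Let u and v be two nodes of G, and let $[\alpha_u, \beta_u)$ be the interval labeling u in the spanning tree. u is a descendant of v if and only if there exists an interval $[\alpha, \beta)$ in L(v) such that $\alpha_u \in [\alpha, \beta)$.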

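Example:
L(f) = [4,5)[5,6)[8,9), and the spanning-tree interval of f is [8,9)
L(h) = [4,5)[5,6)[7,10)[10,13)

The left endpoint 8 of f's spanning-tree interval falls inside the interval [7,10) of L(h), which implies that node f is a descendant of node h.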
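The reachability check can be sketched as a lookup over the interval sequences. The tree intervals and the sequences L(v) below are copied from the answer above. The dictionary layout and the function name `is_descendant` are assumptions, and the linear scan could be replaced by a binary search if the intervals in each sequence are kept sorted.

```python
# Spanning-tree interval of each node, as given in the answer.
TREE_INTERVAL = {
    "a": (0, 13), "b": (1, 6), "c": (2, 5), "p": (3, 5), "k": (4, 5),
    "d": (5, 6), "r": (6, 10), "e": (7, 10), "f": (8, 9), "g": (9, 10),
    "h": (10, 13), "i": (11, 12), "j": (12, 13),
}

# Interval sequence L(v) of each node, as given in the answer.
L = {
    "k": [(4, 5)],
    "d": [(4, 5), (5, 6)],
    "p": [(3, 5)],
    "c": [(2, 5)],
    "g": [(4, 5), (5, 6), (9, 10)],
    "f": [(4, 5), (5, 6), (8, 9)],
    "i": [(4, 5), (5, 6), (8, 9), (11, 12)],
    "e": [(7, 10)],
    "r": [(2, 5), (6, 10)],
    "j": [(2, 5), (6, 10), (12, 13)],
    "h": [(4, 5), (5, 6), (7, 10), (10, 13)],
    "b": [(1, 6)],
    "a": [(0, 13)],
}

def is_descendant(u, v):
    """u is a descendant of v if the left endpoint of u's tree interval
    falls inside some interval of L(v)."""
    left_u = TREE_INTERVAL[u][0]
    return any(lo <= left_u < hi for lo, hi in L[v])

print(is_descendant("f", "h"))
print(is_descendant("a", "b"))
```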

END
