DM534 - Introduction to Computer Science, Week 48 Graph Theory

50
DM534 - Introduction to Computer Science, Week 48 Graph Theory Daniel Merkle [email protected] 1

Transcript of DM534 - Introduction to Computer Science, Week 48 Graph Theory

DM534- IntroductiontoComputerScience,Week48

GraphTheory

[email protected]

1

GraphTheory- Motivation

2

SocialNetworks

ThisgraphmightdepictFacebookfriendship relations,orTwitterfollower relations,or…3

ChemicalCompounds

IsomersofHexane4

MetabolicNetworks

MetabolicNetworkofE.coli.5

Whatisagraph?

Vertices: P,Q,R,S,TEdges: allthelinesDegree ofavertex: numberofedgeswiththatvertexasanend-point

6

Interpretation:

Thegraphfromthelastslidemightdepictthisroadmap.NotethattheintersectionofthelinesPSandQTisnotavertex,sinceitdoesnotcorrespondtoacross-roads

7

AnotherInterpretation:

IfP,Q,R,SandTrepresentfootballteams,thentheexistenceofanedgemightcorrespondtotheplayingofagamebetweentheteamsatitsend-points.Thus,teamPhasplayedagainstteamsQ,SandT,butnotagainstteamR.Inthisrepresentation,thedegreeofvertexisthenumberofgamesplayedbythecorrespondingteam.

8

Twodifferentgraphs?No!

Intherightgraphwehaveremovedthe'crossing'of thelinesPSandQTbydrawingthelinePSoutside therectanglePQST.Theresultinggraphstilltellsuswhetherthereisadirectroadfromoneintersection toanother,andwhichfootballteamshaveplayedwhich.Theonlyinformation wehavelostconcerns'metrical'properties,suchasthelengthofaroadandthestraightnessofawire.

9

Thefirstscientificarticleusingthetermgraph

10

DirectedGraphs(Digraphs)

Assumeagainagraphdepictsaroadmap.Thestudyofdirectedgraphs(ordigraphs,asweabbreviatethem)ariseswhenmaking theroadsintoone-waystreets.Anexampleofadigraphisgivenabove,thedirectionsoftheone-waystreetsbeingindicatedbyarrows.(Inthisexample,therewouldbechaosatT,butthatdoesnotstopusfromstudying suchsituations!)

11

Walks,Paths,andCycles

Muchofgraphtheory involves'walks'ofvariouskinds.Awalkisa'wayofgettingfromonevertextoanother',andconsistsofasequenceofedges,onefollowing afteranother.Forexample,intheabovefigure P—>Q—>Risawalkoflength2,andP—>S—>Q—>T—>S—>Risawalkoflength5.Awalkinwhichnovertexappearsmorethanonceiscalledapath;forexampleandP—>Q—>R—>S isapath.Awalkinwhichyouendwhereyoustarted,forexampleQ—>S—>T—>Q,iscalledacycle. 12

Connectedness

Somegraphsareintwoormoreparts.Forexample,consider thegraphwhoseverticesarethestationsof theCopenhagenMetroandtheNewYorkSubway,andwhoseedgesarethelinesjoining them.Itisimpossible totravelfromØsterport toGrandCentralStationusingonlyedgesofthisgraph,butifweconfineourattentiontotheCopenhagenMetroonly,thenwecantravelfromanystationtoanyother.Agraphthatisinonepiece,sothatanytwoverticesareconnectedbyapath,isaconnectedgraph;agraphinmorethanonepieceisadisconnectedgraph.

13

WeightedGraphs

Consider theabovegraph:itisaconnectedgraphinwhichanon-negativenumber isassigned toeachedge.Suchagraph iscalledaweightedgraph,andthenumberassignedtoeachedgeeistheweightofe,denotedbyw(e).Example:Suppose thatwehavea'map'of theformshownabove,inwhichthelettersAtoLrefertotownsthatareconnectedbyroads.Thentheweightsmaydenote thelengthoftheseroads.

14

ShortestPath(betweenone pairofvertices)

Whatisthelengthoftheshortestpath(=distance)fromAtoL?

Theproblem istofindapathfromAtoLwithminimum totalweight.Thisproblem iscalledtheShortestPathProblem.Notethat,ifwehaveaweightedgraphinwhicheachedgehasweight1,thentheproblemreducestothatof finding thenumberofedgesintheshortestpathfromAtoL.

15

All-PairsShortestPath

Whatisthelengthof theshortestpath(=distances)fromanyvertextoanyvertex?

Thisproblem iscalledtheAll-PairsShortestPathProblem

16

All-PairsShortestPath:ASolutionforSomeCitiesinAustralia

17

Oneofthemostdecorativetablesofdistances(inRomanmiles)betweenmajorEuropeancitiesprintedintheeighteenthcentury.Notonlywerethedataextremelyusefulfortravelingbutalsoforsendingaletter,becausedistance,notweight,determinedtheprice.

(Fromthe“HistoricMapsCollection”,PrincetonUniversityLibrary,link:here

http://libweb5.princeton.edu/visual_materials/maps/websites/

thematic-maps/introduction/introduction.html)18

MatrixRepresentationsforGraphs

IfGisagraphwithverticeslabelled {1,2,...},itsadjacencymatrixA isthenxnmatrixwhoseij-th entryisthenumberofedgesjoining vertexi andvertexj.Twonodesi and jareadjacentiftheij-th entry intheadjcacencymatrixislargerthan0.

If,inaddition tothevertices,theedgesarelabelled{1,2,...,m},itsincidencematrixMisthenxmmatrixwhoseij-th entryis1ifvertexi isincidenttoedgej and0otherwise.ThefigureaboveshowsalabelledgraphGwithitsadjacencyandincidencematrices. 19

AdjacencyMatrixforWeightedGraphs

GivenaweightedgraphG,theadjacencymatrixA isthematrixwhoseij-th entryistheweightoftheedgebetweenvertexi andvertexj.

20

Matrix-MatrixMultiplicationRecap

✓1 0 2 3�1 2 2 1

◆⇥

0

BB@

1 2 34 5 21 2 11 2 5

1

CCA =

✓ ◆

21

Matrix-MatrixMultiplicationRecap

✓1 0 2 3�1 2 2 1

◆⇥

0

BB@

1 2 34 5 21 2 11 2 5

1

CCA =

✓ ◆

22

Matrix-MatrixMultiplicationRecap

✓1 0 2 3�1 2 2 1

◆⇥

0

BB@

1 2 34 5 21 2 11 2 5

1

CCA =

✓6 12 2010 14 8

M ⇥N = R

rij =X

k

mik ⇤ nkj

23

Matrix-MatrixMultiplicationRecap

✓1 0 2 3�1 2 2 1

◆⇥

0

BB@

1 2 34 5 21 2 11 2 5

1

CCA =

✓6 12 2010 14 8

✓r00 r01 r02r10 r11 r12

◆ ✓r11 r12 r13r21 r22 r23

◆Zero-basedNumbering (“Zeroindexed”) One-basedNumbering (“Oneindexed”)

24

Zero-Indexing

(picturefromxkcd.com)

Zero-basednumberingisawayofnumbering inwhichtheinitialelementofasequenceisassigned theindex0,ratherthantheindex1asistypicalineverydaynon-mathematical/non-programming circumstances.

Makesurethatitisclearwhatyoumean,whenyousay,e.g.,the“rowwithindex1”inamatrix.

25

Matrix-MatrixMultiplicationinPython(forSquareMatrices)

Numberofadditionsperresult[i][j] entry: sizeNumberofmultiplicationsperresult[i][j] entry: sizeNumberofentriesintheresultmatrix: size x sizeOverallnumberofoperations (additionsandmultiplications): 2 x size x ( size x size )

Overallcomputationalruntime: O(size3)

ProvidedCode:matMult.py

26

Matrix-MatrixMultiplicationinPython

ProvidedCode:matMult.py

27

CommentstoPythonCode

• Creatingalistofthree0’s:

• Creatingalistoftwolistswiththree0’s(i.e.,a“matrix”ofsize2x3):

28

MatricesinPython:ImplementedasListsofLists:

“Matrix”dimensions:

M haslen(M)manyrowsandlen(M[0])manycolumnsN haslen(N)manyrowsandlen(N[0])manycolumnsTheresultneedstohavelen(M)manyrowsandlen(N[0])manycolumns

ProvidedCode:matMult.py

29

PowersoftheAdjacencyMatrix

A =

0

BBBBBB@

1 2 3 4 5 6

1 0 1 0 0 1 02 1 0 1 0 1 03 0 1 0 0 0 04 0 0 0 0 1 15 1 1 0 1 0 06 0 0 0 1 0 0

1

CCCCCCA

Ak= A⇥A . . .⇥A| {z }

k times

is called the k-th power of the adjacency matrix

30

A2 =

0

BBBBBB@

1 2 3 4 5 6

1 2 1 1 1 1 02 1 3 0 1 1 03 1 0 1 0 1 04 1 1 0 2 0 05 1 1 1 0 3 16 0 0 0 0 1 1

1

CCCCCCAA3 =

0

BBBBBB@

1 2 3 4 5 6

1 2 4 1 1 4 12 4 2 3 1 5 13 1 3 0 1 1 04 1 1 1 0 4 25 4 5 1 4 2 06 1 1 0 2 0 0

1

CCCCCCAA4 =

0

BBBBBB@

1 2 3 4 5 6

1 8 7 4 5 7 12 7 12 2 6 7 13 4 2 3 1 5 14 5 6 1 6 2 05 7 7 5 2 13 46 1 1 1 0 4 2

1

CCCCCCA

Theorem:

If G is a graph with adjacency matrix A, and vertices

with indices 1, . . . , n then for each positive integer k

the ij-th entry of Ak

is

the number of di↵erent walks using exactly k edges

from node i to node j

31

A2 =

0

BBBBBB@

1 2 3 4 5 6

1 2 1 1 1 1 02 1 3 0 1 1 03 1 0 1 0 1 04 1 1 0 2 0 05 1 1 1 0 3 16 0 0 0 0 1 1

1

CCCCCCAA3 =

0

BBBBBB@

1 2 3 4 5 6

1 2 4 1 1 4 12 4 2 3 1 5 13 1 3 0 1 1 04 1 1 1 0 4 25 4 5 1 4 2 06 1 1 0 2 0 0

1

CCCCCCAA4 =

0

BBBBBB@

1 2 3 4 5 6

1 8 7 4 5 7 12 7 12 2 6 7 13 4 2 3 1 5 14 5 6 1 6 2 05 7 7 5 2 13 46 1 1 1 0 4 2

1

CCCCCCA

Example:Considerthetwoverticeswithindex4and5in𝐴"

Length4walks:1) 4->5->1->2->52) 4->5->2->1->5

Thereare2walksoflength4.Furthermore,𝐴"#" =2.

A =

0

BBBBBB@

1 2 3 4 5 6

1 0 1 0 0 1 02 1 0 1 0 1 03 0 1 0 0 0 04 0 0 0 0 1 15 1 1 0 1 0 06 0 0 0 1 0 0

1

CCCCCCA

32

InPython3

ProvidedCode:adjacencyMatMult.py 33

Proof:(alsoonblackboard)Let G be a graph with adjacency matrix A, and vertices 1, . . . , n. We proceed

by induction on k to obtain the result.

Base Case:

Let k = 1. A1= A. aij is the number of edges from i to j, which is identical to

the number of walks of length 1 from i to j.

Inductive Step:

Assume true for a positive integer k. Let bij be the ij-th entry of Ak, and let

aij be the ij-th entry of A. By the inductive hypothesis bij is the number of

walks of length k from i to j. Consider the ij-th entry of Ak+1= A ⇥ Ak

, i.e,

Ak+1ij = ai1b1j + ai2b2j + . . . + ainbnj =

Pnk=1 aikbkj . Consider ai1b1j . This is

equal to the number of walks of length 1 from i to 1 times the number of walks

of length k from 1 to j. This is therefor equal to the number of walks of length

k + 1 from i to j, where 1 is the second vertex. This argument holds for each

vertex m, i.e., aimbmj is the number of walks from i to j in which m is the

second vertex. Therefore, the sum is the number of all possible walks from ito j. 34

AlgorithmforAll-PairsShortestPath

WeightedGraphGwithweightsonedges:

• Whatisthedistance(=lengthoftheshortestpath) betweenAandL?

17

Generalization:• Whatarethedistances of

ALLpaths(=lenghts ofALLshortestpaths)betweenallpairsofnodes?

…andhowcanwefindallthesedistances? 35

TheEdgeWeightMatrixW

Definition:

Wij =

8><

>:

the weight of the edge (i, j) if the edge (i, j) exists

0 if i = j

1 else

Example:

W =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 1 1 2 12 1 0 2 1 4 13 1 2 0 1 1 3

4 1 1 1 0 6 1

5 2 4 1 6 0 16 1 1 3 1 1 0

1

CCCCCCCCCA

Note:MatrixWhasentriescorresponding toinfinity,asitmightbeimpossible toreachvertexjfromvertexi via1 edge.

Weassumeallweightsarenotnegative,i.e.,largerorequalto0.

Interpretation:

Wij is the distance from vertex i to vertex j using maximally 1 edge

weightsaredepictedinred

36

AmodifiedMatrix-MatrixMultiplication

0

@1 0 21 2 43 1 2

1

A�

0

@1 2 34 5 21 2 5

1

A =

0

@2 3 22 3 43 4 3

1

A

M �N = R

Note:thisoperationisverysimilartothestandardmatrix-matrixmultiplication:however,forcomputationoftheij-th entrythemultiplication isreplacedbyaddition,andaddition isreplacedbytheminimum operation.

Definition:

rij = mink{mik + nkj}

Example:

r33 = min{3 + 3, 1 + 2, 2 + 5} = 3 37

Theorem:

If G is a weighted graph with edge weight matrix W ,

and vertices with indices 1, . . . , n then for each positive

integer k

the ij-th entry of W k= W �W � . . .�W| {z }

k timesis

the length of the shortest path from i to jusing maximally k edges

38

Examples:Considerthetwoverticeswithindex4and1in𝑊"

ShortestPathusingmaximally4 edges:4->6->3->2->1(distance7)

W 2 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 8 2 12 1 0 2 10 3 5

3 3 2 0 4 6 3

4 8 10 4 0 6 1

5 2 3 6 6 0 7

6 1 5 3 1 7 0

1

CCCCCCCCCA

W 3 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 8 2 6

2 1 0 2 6 3 5

3 3 2 0 4 5 3

4 8 6 4 0 6 1

5 2 3 5 6 0 7

6 6 5 3 1 7 0

1

CCCCCCCCCA

W 4 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 7 2 6

2 1 0 2 6 3 5

3 3 2 0 4 5 3

4 7 6 4 0 6 1

5 2 3 5 6 0 7

6 6 5 3 1 7 0

1

CCCCCCCCCA

Considerthetwoverticeswithindex5and3in𝑊"

ShortestPathusingmaximally4 edges:5->1->2->3(distance5)

W =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 1 1 2 12 1 0 2 1 4 13 1 2 0 1 1 3

4 1 1 1 0 6 1

5 2 4 1 6 0 16 1 1 3 1 1 0

1

CCCCCCCCCA

39

Matrix-MatrixMultiplicationinPython(forSquareMatrices)

40

ModifiedMatrix-MatrixMultiplicationinPython(forSquareMatrices)

StandardMatrix-MatrixMultiplication:

ProvidedCode:shortestPaths.py 41

InPython3

ProvidedCode:shortestPaths.pyNote:python3 requiredbecauseofinf

42

W 2 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 8 2 12 1 0 2 10 3 5

3 3 2 0 4 6 3

4 8 10 4 0 6 1

5 2 3 6 6 0 7

6 1 5 3 1 7 0

1

CCCCCCCCCA

W 3 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 8 2 6

2 1 0 2 6 3 5

3 3 2 0 4 5 3

4 8 6 4 0 6 1

5 2 3 5 6 0 7

6 6 5 3 1 7 0

1

CCCCCCCCCA

W 4 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 7 2 6

2 1 0 2 6 3 5

3 3 2 0 4 5 3

4 7 6 4 0 6 1

5 2 3 5 6 0 7

6 6 5 3 1 7 0

1

CCCCCCCCCA

W 5 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 3 7 2 6

2 1 0 2 6 3 5

3 3 2 0 4 5 3

4 7 6 4 0 6 1

5 2 3 5 6 0 7

6 6 5 3 1 7 0

1

CCCCCCCCCA

W =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 1 1 2 12 1 0 2 1 4 13 1 2 0 1 1 3

4 1 1 1 0 6 1

5 2 4 1 6 0 16 1 1 3 1 1 0

1

CCCCCCCCCA

W 6=W 2 6=W 3 6=W 4=W 5=W 6= . . .

Assumealledgeweightsarenotnegative.Thenumber ofedgesneededforashortestpathcanmaximallyben-1,wherenisthenumberofverticesinthegraph.Ifthepathwouldgovianedges,thenyouwouldhavetovisitatleastonevertextwice,butthenthepathcannotbeashortestpathanymore.Obviously𝑊% = 𝑊'() forallk>n-1.

Answer: n� 1 (which is identical to |V |� 1)

Which value of k is necessary, in order to have W k

contain all the pairwise distances of all vertexes?

43

Lemma:If G is a weighted graph with edge weight matrix W ,

and vertices with indices 1, . . . , n then

the ij-th entry of Wn�1= W �W � . . .�W| {z }

n � 1 times

is

the distance from i to j

D := Wn�1is called the distance matrix of the graph G.

44

ComputationoftheDistanceMatrixbyRepeatedSquaring

Wn�1 =

0

BBBBBBBBBBBBBBBB@

0

BBBBBBBBBBB@

0

BBBBBB@

0

B@(W �W )| {z }W 2

�W

1

CA

| {z }W 3

�W

1

CCCCCCA

| {z }W 4

�W

1

CCCCCCCCCCCA

| {z }W 5

� . . .�W

1

CCCCCCCCCCCCCCCCA

| {z }Wn�1

W (2k) =

0

BBBBBBBBBBB@

0

BBBBBBBBBBB@

0

BBBBBB@

0

B@(W �W )| {z }W 2

1

CA

2

| {z }W 4

1

CCCCCCA

2

| {z }W 8

1

CCCCCCCCCCCA

2

. . .

1

CCCCCCCCCCCA

2

| {z }W (2k)

n-2matrix-matrixmultiplicationareneeded inordertocomputethedistancematrix𝐷 = 𝑊'()

kmatrix-matrixmultiplicationareneeded (namelysquaringamatrixktimes)inordertocomputethematrix

2%hastobelargerorequalton-1,orequivalently,khastobelargerorequaltolog/(𝑛 − 1)

W (2k)

Example:ConsideragraphGwith101vertices.InordertocomputethedistancematrixD = 𝑊)66 ,theleftapproachneedstomake99matrix-matrixmultiplications.Therightapproach(calledrepeatedsquaring)requiresonly7matrix-matrixmultiplications,as27 = 128,andD = 𝑊)/9 = 𝑊)66

45

TestinPython3

Note:ceil(log2(size-1))returnsthesmallestintegerlargerorequaltolog2(size-1),i.e.,Rwillbethedistancematrixafterthisfor loop.

ProvidedCode:timing.py 46

ThemostobviousApplicationofComputingtheDistanceMatrix:

47

AnotherApplicationoftheDistanceMatrix:PredictingBoilingPointsofParaffins

In1947HarryWienerdefinedtheWiener-Index ofagraphGinordertopredicttheboilingpointofdifferentparaffins.HeusedthegraphrepresentationGofthecarbonbackboneofamoleculewithn carbonatomsandcalculatedtheWiener-Indexthesumofalldistancesbetweenallpairsofvertexes,i.e.

Hepredictedtheboilingpoint𝑡; tobe

W(G) =1

2

nX

i=1

nX

j=1

Dij

tB = t0 �✓98

n2(w0 �W(G)) + 5.5 · (p0 � p)

with t0 = 745.42 · log10(n+ 4.4)� 689.4

w0 =

1

6

· (n+ 1) · n · (n� 1)

p0 = n� 3

p = the number of shortest paths i ! . . . ! j of length 3 in G with i < j

= half of the number of entries ”3” in the distance matrix D 48

WienerIndex:BoilingPointPrediction,Example(2,2-dimethylbutan)

14 6

52

3 W =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 1 1 1 1 12 1 0 1 1 1 13 1 1 0 1 1 14 1 1 1 0 1 1

5 1 1 1 1 0 16 1 1 1 1 1 0

1

CCCCCCCCCA

D = Wn�1 =

0

BBBBBBBBB@

1 2 3 4 5 6

1 0 2 1 2 2 3

2 2 0 1 2 2 3

3 1 1 0 1 1 2

4 2 2 1 0 2 1

5 2 2 1 2 0 3

6 3 3 2 1 3 0

1

CCCCCCCCCA

W(G) =1

2

nX

i=1

nX

j=1

Dij = 28

t0 = 68.72

w0 =1

6· 5 · 6 · 7 = 35

p0 = 6� 3 = 3

p = 3

tB = t0 �✓98

n2(w0 �W(G)) + 5.5 · (p0 � p)

= 68.72� 98

36(35� 28) + 5.5 · (3� 3)

= 49.66

Thechemicalcompound

ThecarbonbackboneGraphG EdgeWeightMatrix

DistanceMatrixCalculationofWienerIndexandotherparameters,aswellastheresultingboiling pointprediction.

Note:Dependingonhowyouchosetolabelyourgraph,theedgeweightmatrixmightlookdifferent.Thiswon’tmatterforthesubsequent calculations.

49

WienerIndex:BoilingPointPrediction,Example(2,2-dimethylbutan)

Predicted Boiling Point: tB = 49.66

Real Boiling Point: trealB ⇡ 49.7� 50.0

Thepredictionofboilingpointsofparaffins basedontheWiener-Indexofthecorrespondingmoleculargraphisamazinglyaccurate.Tryityourself (seeexercises)!Intuitively,theWiener-Indexquantifies the“compactness”ofagraph(ormolecule). LongsinglechainedmoleculeswithncarbonshaveasmallerWiener-Indexthanmoleculesthatcontainmanybranches.Longmoleculesareeasiertobreak,andhaveusuallyalowerboilingpoint.

50