On Measuring the Complexity of Networks: Kolmogorov Complexity...

Research ArticleOn Measuring the Complexity of Networks KolmogorovComplexity versus Entropy

MikoBaj Morzy1 Tomasz Kajdanowicz2 and PrzemysBaw Kazienko2

1 Institute of Computing Science Poznan University of Technology Piotrowo 2 60-965 Poznan Poland2Department of Computational Intelligence ENGINE-The European Centre for Data Science Wrocław University ofScience and Technology Wybrzeze Wyspianskiego 27 50-370 Wrocław Poland

Correspondence should be addressed to Mikołaj Morzy mikolajmorzyputpoznanpl

Received 6 April 2017 Revised 27 July 2017 Accepted 13 August 2017 Published 1 November 2017

Academic Editor Pasquale De Meo

Copyright copy 2017 Mikołaj Morzy et alThis is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

One of the most popular methods of estimating the complexity of networks is to measure the entropy of network invariants suchas adjacency matrices or degree sequences Unfortunately entropy and all entropy-based information-theoretic measures haveseveral vulnerabilities These measures neither are independent of a particular representation of the network nor can capture theproperties of the generative process which produces the network Instead we advocate the use of the algorithmic entropy as thebasis for complexity definition for networks Algorithmic entropy (also known as Kolmogorov complexity or 119870-complexity forshort) evaluates the complexity of the description required for a lossless recreation of the network This measure is not affected bya particular choice of network features and it does not depend on the method of network representation We perform experimentson Shannon entropy and 119870-complexity for gradually evolving networks The results of these experiments point to 119870-complexityas the more robust and reliable measure of network complexity The original contribution of the paper includes the introductionof several new entropy-deceiving networks and the empirical comparison of entropy and119870-complexity as fundamental quantitiesfor constructing complexity measures for networks

1 Introduction

Networks are becoming increasingly more important in con-temporary information science due to the fact that they pro-vide a holistic model for representing many real-world phe-nomena The abundance of data on interactions within com-plex systems allows network science to describe model sim-ulate and predict behaviors and states of such complex sys-tems It is thus important to characterize networks in termsof their complexity in order to adjust analytical methods toparticular networks The measure of network complexity isessential for numerous applications For instance the levelof network complexity can determine the course of variousprocesses happeningwithin the network such as informationdiffusion failure propagation actions related to control orresilience preservation Network complexity has been suc-cessfully used to investigate the structure of software libraries[1] to compute the properties of chemical structures [2]

to assess the quality of business processes [3ndash5] and toprovide general characterizations of networks [6 7]

Complex networks are ubiquitous in many areas ofscience such as mathematics biology chemistry systemsengineering physics sociology and computer science toname a few Yet the very notion of network complexitylacks a strict and agreed-upon definition In general anetwork is considered ldquocomplexrdquo if it exhibits many fea-tures such as small diameter high clustering coefficientanticorrelation of node degrees presence of network motifsand modularity structures [8] These features are commonin real-world networks but they rarely appear in artificialrandom networks Finding a good metric with which onecan estimate the complexity of a network is not a trivialtask A good complexity measure should not depend solelyon the number of vertices and edges but it must take intoconsideration topological characteristics of the network Inaddition complexity is not synonymous with randomness

HindawiComplexityVolume 2017 Article ID 3250301 12 pageshttpsdoiorg10115520173250301

2 Complexity

or unexpectedness As has been pointed out [8] withinthe spectrum of possible networks from the most ordered(cliques paths and stars) to the most disordered (randomnetworks) complex networks occupy the very center ofthis spectrum Finally a good complexity measure shouldnot depend on a particular network representation andshould yield consistent results for various representationsof the same network (adjacency matrix Laplacian matrixand degree sequence) Unfortunately as current researchsuggests finding a good complexity measure applicable to awide variety of networks is very challenging [9ndash11]

Among many possible measures which can be used todefine the complexity of networks the entropy of variousnetwork invariants has been by far the most popular choiceNetwork invariants considered for defining entropy-basedcomplexity measures include number of vertices number ofneighbors number of neighbors at a given distance [12] dis-tance between vertices [13] energy of network matrices suchas Randic matrix [14] or Laplacian matrix [15] and degreesequences There are multiple definitions of entropies usu-ally broadly categorized into three families thermodynamicentropies statistical entropies and information-theoreticentropies In the field of computer science information-theoretic measures are the most prevalent and they includeShannon entropy [16] Kolmogorov-Sinai entropy [17] andRenyi entropy [18] These entropies are based on the conceptof the information content of a system and they mea-sure the amount of information required to transmit thedescription of an object The underlying assumption ofusing information-theoretic definitions of entropy is thatuncertainty (as measured by entropy) is a nondecreasingfunction of the amount of available information In otherwords systems in which little information is available arecharacterized by low entropy and therefore are consideredto be ldquosimplerdquo The first idea to use entropy to quantify thecomplexity of networks comes fromMowshowitz [19]

Despite the ubiquitousness of general-purpose entropydefinitions many researchers have developed specializedentropy definitions aimed at describing the structure ofnetworks [10] Notable examples of such definitions includethe proposal by Ji et al to measure the unexpectedness of aparticular network by comparing it to the number of possiblenetwork configurations available for a given set of parameters[20] This concept is clearly inspired by algorithmic entropywhich defines the complexity of a system not in terms of itsinformation content but in terms of its generative processA different approach to measure the entropy of networks hasbeen introduced by Dehmer under the form of informationfunctional [21] Information functional can be also used toquantify network entropy in terms of 119896-neighborhoods ofvertices [12 13] or independent sets of vertices [22] Yetanother approach to network entropy has been proposed byKorner who advocates the use of stable sets of vertices as thebasis to compute network entropy [23] Several comprehen-sive surveys of network entropy applications are also available[9 11]

Within the realm of information science the complexityof a system is most often associated with the number of

possible interactions between elements of the system Com-plex systems evolve over time they are sensitive to evenminor perturbations at the initial steps of development andoften involve nontrivial relationships between constituentelements Systems exhibiting high degree of interconnect-edness in their structure andor behavior are commonlythought to be difficult to describe and predict and as aconsequence such systems are considered to be ldquocomplexrdquoAnother possible interpretation of the term ldquocomplexrdquo relatesto the size of the system In the case of networks one mightconsider to use the number of vertices and edges to estimatethe complexity of a network However the size of the networkis not a good indicator of its complexity because networkswhich have well-defined structures and behaviors are ingeneral computationally simple

In this work we do not introduce a new complex-ity measure or propose new informational functional andnetwork invariants on which an entropy-based complexitymeasure could be defined Rather we follow the observationsformulated in [24] andwe present the criticism of the entropyas the guiding principle of complexity measure construc-tion Thus we do not use any specific formal definitionof complexity but we provide additional arguments whyentropy may be easily deceived when trying to evaluatethe complexity of a network Our main hypothesis is thatalgorithmic entropy also known as Kolmogorov complexityis superior to traditional Shannon entropy due to the factthat algorithmic entropy is more robust less dependent onthe network representation and better aligned with intuitivehuman understanding of complexity

The organization of the paper is the following In Sec-tion 2 we introduce basic definitions related to entropy andwe formulate arguments against the use of entropy as thecomplexity measure of networks Section 23 presents severalexamples of entropy-deceiving networks which provide bothmotivation and anecdotal evidence for our hypothesis InSection 3 we introduce Kolmogorov complexity andwe showhow this measure can be applied to networks despite itshigh computational cost The results of the experimentalcomparison of entropy and Kolmogorov complexity arepresented in Section 4The paper concludes in Section 5 witha brief summary and future work agenda

2 Entropy as the Measure ofNetwork Complexity

21 Basic Definitions Let us introduce basic definitions andnotation used throughout the remainder of this paper Anetwork is an ordered pair119866 = ⟨119881 119864⟩ where119881 = V1 V119873is the set of vertices and 119864 = (V119894 V119895) isin 119881 times 119881 is the setof edges The degree 119889(V119894) of the vertex V119894 is the number ofvertices adjacent to it 119889(V119894) = |V119895 (V119894 V119895) isin 119864| A givennetwork can be represented inmany ways for instance usingan adjacency matrix defined as

119860119873times119873 [119894 119895] = 1 if (V119894 V119895) isin 1198640 otherwise (1)

Complexity 3

An alternative to the adjacency matrix is the Laplacianmatrix of the network defined as

119871119873times119873 [119894 119895] =

119889 (V119894) if 119894 = 119895minus1 if 119894 = 119895 (V119894 V119895) isin 1198640 otherwise

(2)

Other popular representations of networks include thedegree list defined as 119863 = ⟨119889(V1) 119889(V2) 119889(V119899)⟩ and thedegree distribution defined as

119875 (119889119894) = 119901 (119889 (V119895) = 119889119894) =10038161003816100381610038161003816V119895 isin 119881 119889 (V119895) = 11988911989410038161003816100381610038161003816119873 (3)

Although there are numerous different definitions ofentropy in this work we are focusing on the definitionmost commonly used in information sciences the Shan-non entropy [16] This measure represents the amount ofinformation required to provide the statistical description ofthe network Given any discrete random variable 119883 with 119899possible outcomes the Shannon entropy119867(119883) of the variable119883 is defined as the function of the probability 119901 of alloutcomes of119883

119867(119883) = minus 119899sum119894=1

119901 (119909119894) log119887 119901 (119909119894) (4)

Depending on the selected base of the logarithm theentropy is expressed in bits (119887 = 2) nats (119887 = 119890) or dits (119887 =10) (bits are also known as Shannon and dits are also knownas Hartley) The above definition applies to discrete randomvariables for random variables with continuous probabilitydistributions differential entropy is used usually along withthe limiting density of discrete points Given a variable119883with119899 possible discrete outcomes such that in the limit 119899 rarr infinthe density of 119883 approaches the invariant measure 119898(119909) thecontinuous entropy is given by

lim119899rarrinfin

119867(119883) = minusint119901 (119909) 119901 (119909)119898 (119909) 119889119909 (5)

In thiswork we are interested inmeasuring the entropy ofvarious network invariants These invariants can be regardedas discrete random variables with the number of possibleoutcomes bound by the size of the available alphabet eitherbinary (in the case of adjacency matrices) or decimal (inthe case of other invariants) Consider the 3-regular graphpresented in Figure 1 This graph can be described using thefollowing adjacency matrix

11986010times10 =

[[[[[[[[[[[[[[[[[[[[[[[

0 0 1 0 1 0 0 0 1 00 0 0 0 1 0 0 1 1 01 0 0 1 0 1 0 0 0 00 0 1 0 0 1 1 0 0 01 1 0 0 0 0 0 0 0 10 0 1 1 0 0 0 1 0 00 0 0 1 0 0 0 1 0 10 1 0 0 0 1 1 0 0 01 1 0 0 0 0 0 0 0 10 0 0 0 1 0 1 0 1 0

]]]]]]]]]]]]]]]]]]]]]]]

(6)

1

2

3

4

5

6

7

89

10

Figure 1 Three-regular graph with 10 vertices

This matrix in turn can be flattened to a vector (eitherrow-wise or column-wise) and this vector can be treatedas a random variable with two possible outcomes 0 and 1Counting the number of occurrences of these outcomes wearrive at the random variable 119883 = 1199090 = 07 1199091 = 03 andits entropy 119867(119883) = 088 Alternatively this graph can bedescribed using the degree list 119863 = ⟨3 3 3 3 3 3 3 3 3 3⟩which can be treated as the random variable with the entropy119867(119863) = 0 Yet another possible random variable that can bederived from this graph is the degree distribution119875119863 = 1198890 =0 1198891 = 0 1198892 = 0 1198893 = 1 with the entropy 119867(119875119863) = 0In summary any network invariant can be used to extract arandom variable and compute its entropy

Thus in the remainder of the paper whenever mention-ing entropy we will refer to the entropy of a discrete randomvariable In general the higher the randomness the greaterthe entropy The value of entropy is maximal for a randomvariable with the uniform distribution and the minimumvalue of entropy is attained by a constant random variableThis kind of entropy will be further explored in this paper inorder to reveal its weaknesses

As an alternative to Shannon entropy we advocate the useof Kolmogorov complexity We postpone the discussion ofKolmogorov complexity to Section 3 where we provide bothits definition and the method to approximate this incom-putable measure For the sake of brevity in the remainder ofthis paper we will use the term ldquoentropyrdquo to refer to Shannonentropy and the term ldquo119870-complexityrdquo to refer to Kolmogorovcomplexity

22 Why Is Entropy a Bad Measure of Network ComplexityZenil et al [24] argue that entropy is not appropriate tomeasure the true complexity of a network and they presentseveral examples of networks which should not qualify ascomplex (using the colloquial understanding of the term)yet which attain maximum entropy of various networkinvariants We follow the line of argumentation of Zenilet al and we present more examples of entropy-deceivingnetworks Our main aim is to show that it is relatively easy

4 Complexity

Figure 2 Block network composed of eight of the same 3-nodeblocks

to construct a network which achieves high values of entropyof various network invariants Examples presented in thissection outline the main problem with using entropy asthe basis for complexity measure construction namely thatentropy is not aligned with intuitive human understanding ofcomplexity Statistical randomness as measured by entropydoes not imply complexity in a useful operational way

The main reason why entropy and other entropy-relatedinformation-theoretic measures fail to correctly describe thecomplexity of a network is the fact that these measures arenot independent of the network representation As amatter offact this remark applies equally to all computablemeasures ofnetwork complexity It is quite easy to present examples of twoequivalent lossless descriptions of the same network havingvery different entropy values aswewill show in Section 23 Inthis paper we experiment with four different representationsof networks adjacency matrices Laplacian matrices degreelists and degree distributions We show empirically that thechoice of a particular representation of the network stronglyinfluences the resulting entropy estimation

Another property which makes entropy a questionablemeasure of network complexity is the fact that entropy cannotbe applied to several network features at the same time but itoperates on a single feature for example degree and between-ness In theory one could devise a function which wouldbe a composition of individual features but high complexityof the composition does not imply high complexity of allits components and vice versa This requirement to select aparticular feature and compute its probability distributiondisqualifies entropy as a universal and independent measureof complexity

In addition an often forgotten aspect of entropy is thefact that measuring entropy requires making an arbitrarychoice regarding the aggregation level of the variable forwhich entropy is computed Consider the network presentedin Figure 2 At the first glance this network seems to be fairlyrandom The density of the network is 035 and its entropycomputed over adjacency matrix is 092 bits However this

network has been generated using a very simple procedureWe begin with the initial matrix

1198723times3 = [[[0 1 01 0 10 0 1

]]] (7)

Next we create 64 copies of this matrix and each ofthese copies is randomly transposed Finally we bind allthese matrices together to form a square matrix11987224times24 andwe use it as the adjacency matrix to create the networkSo if we were to coalesce the adjacency matrix into 3 times 3blocks the entropy of the adjacency matrix would be 0since all constituent blocks are the same It would mean thatthe network is actually deterministic and its complexity isminimal On the other hand it should be noted that thisshortcoming of entropy can be circumvented by using theentropy rate (119899-gram entropy) instead because entropy ratecalculates the entropy for all possible levels of granularity ofa variable Given a random variable 119883 = ⟨1199091 1199092 119909119899⟩let 119901(119909119894 119909119894+1 119909119894+119897) denote the joint probability over 119897consecutive values of119883 Entropy rate119867119897(119883) of a sequence of119897 consecutive values of119883 is defined as

119867119897 (119883)= minus sum1199091isin119883

sdot sdot sdot sum119909119897isin119883

119901 (1199091 119909119897) log2 119901 (1199091 119909119897) (8)

Entropy rate of the variable 119883 is simply the limit of theabove estimation for 119897 rarr infin

23 Entropy-Deceiving Networks In this section we presentfour different examples of entropy-deceiving networks sim-ilar to the idea coined in [24] Each of these networks hasa simple generative procedure and should not (intuitively)be treated as complex However if the entropy was used toconstruct a complexity measure these networks would havebeen qualified as complexThe examples given in this sectiondisregard any specific definition of complexity their aim isto outline main shortcomings of entropy as the basis for anycomplexity measure construction

231 Degree Sequence Network Degree sequence networkis an example of a network which has an interesting prop-erty there are exactly two vertices for each degree value1 2 1198732119873 = |119881|

The procedure to generate degree sequence network isvery simple First we create a linked list of all119873 vertices forwhich 119889(V1) = 119889(V119873) = 1 and forall119894 = 1 119894 = 119873 119889(V119894) = 2 It isa circle without one edge (V1 V119873) Next starting with vertexV3 we follow a simple rule

for 119894 = 3 to 1198732 dofor 119895 = 1 to (119894 minus 2) doadd edge(V119894 V1198732+119895)

end forend for

Complexity 5

Degree123

456

789

101112

131415

161718

19

Figure 3 Degree sequence network

The resulting network is presented in Figure 3 It isvery regular with a uniform distribution of vertex degreesdue to its straightforward generation procedure Howeverif one would examine the entropy of the degree sequencethis entropy would be maximal for a given number 119873 ofvertices suggesting far greater randomness of such networkThis example shows that entropy of the degree sequence (andthe entropy of the degree distribution) can be verymisleadingwhen trying to evaluate the true complexity of a network

232 Copeland-Erdos Network The Copeland-Erdos net-work is a network which seems to be completely randomdespite the fact that the procedure of its generation isdeterministic The Copeland-Erdos constant is a constantwhich is produced by concatenating ldquo0rdquo with the sequenceof consecutive prime numbers [25] When prime numbersare expressed in base 10 the Copeland-Erdos constant isa normal number that is its infinite sequence of digits isuniformly distributed (the normality of the Copeland-Erdosconstant in bases other than 10 is not proven)This fact allowsus to devise the following simple generative procedure for anetwork Given the number of vertices 119873 take the first 1198732digits of the Copeland-Erdos constant and represent them asthe matrix of the size119873times119873 Next binarize each value in thematrix using the function 119891(119909) = 119909div5 (integer division)and use it as the adjacency matrix to create a networkSince each digit in the matrix is approximately equally likelythe resulting binary matrix will have approximately thesame number of 0rsquos and 1rsquos An example of the Copeland-Erdos network is presented in Figure 4 The entropy ofthe adjacency matrix is maximal for a given number of 119873vertices furthermore the network may seem to be randomand complex but its generative procedure as we can see isvery simple

Degree101112

131415

161718

Figure 4 Copeland-Erdos network

Figure 5 2-Clique network

233 2-Clique Network 2-Clique network is an artificialexample of a network in which the entropy of the adjacencymatrix is maximal The procedure to generate this networkis as follows We begin with two connected vertices labeledred and blue We add red and blue vertices alternatinglyeach time connecting the newly added vertex with all othervertices of the same color As a result two cliques appear (seeFigure 5) Since there are asmany red vertices as there are bluevertices the adjacency matrix contains the same number of0rsquos and 1rsquos (not considering the 1 representing the bridge edgebetween cliques) So entropy of the adjacency matrix is closeto maximal although the structure of the network is trivial

234 Ouroboros Network Ouroboros (Ouroboros is anancient symbol of a serpent eating its own tail appearingfirst in Egyptian iconography and then gaining notoriety inlater magical traditions) network is another example of anentropy-deceiving network The procedure to generate this

6 Complexity

network is very simple for a given number119873 of vertices wecreate two closed rings each consisting of 1198732 vertices andwe connect corresponding vertices of the two rings Finallywe break a single edge in one ring and we put a single vertexat the end of the broken edge The result of this procedurecan be seen in Figure 6 Interestingly even though almost allvertices in this network have equal degree of 3 each vertex hasdifferent betweenness Thus the entropy of the betweennesssequence is maximal suggesting a very complex pattern ofcommunication pathways though the network Obviouslythis network is very simple from the communication pointof view and should not be considered complex

3 119870-Complexity as the Measure ofNetwork Complexity

We strongly believe that Kolmogorov complexity (119870-complexity) is a much more reliable and robust basis forconstructing the complexity measure for compound objectssuch as networks Although inherently incomputable119870-complexity can be easily approximated to a degreewhich allows for the practical use of 119870-complexity inreal-world applications for instance in machine learning[26 27] computer network management [28] and generalcomputation theory (proving lower bounds of various Turingmachines combinatorics formal languages and inductiveinference) [29]

Let us now introduce the formal framework for 119870-complexity and its approximation Note that entropy isdefined for any random variable whereas 119870-complexity isdefined for strings of characters only 119870-complexity 119870119879(119904) ofa string 119904 is formally defined as

119870119879 (119904) = min |119875| 119879 (119875) = 119904 (9)

where 119875 is a program which produces the string 119904 when runon a universal Turing machine 119879 and |119875| is the length of theprogram119875 that is the number of bits required to represent119875Unfortunately 119870-complexity is incomputable [30] or moreprecisely it is upper semicomputable (only the upper boundof the value of 119870-complexity can be computed for a givenstring 119904) One way for approximating the true value of 119870119879(119904)is to use the notion of algorithmic probability introducedby Solomonoff and Levin [31 32] Algorithmic probability119901119886(119904) of a string 119904 is defined as the expected probability that arandom program119875 running on a universal Turingmachine119879with the binary alphabet produces the string 119904 upon halting

119901119886 (119904) = sum119875119879(119875)=119904

12|119875| (10)

Of course there are 2|119875| possible programs of the length|119875| and the summation is performed over all possible pro-gramswithout limiting their lengthwhichmakes algorithmicprobability 119901119886(119904) a semimeasure which itself is incomputableNevertheless algorithmic probability can be used to calculate119870-complexity using the Coding Theorem [31] which states

that algorithmic probability approximates 119870-complexity upto a constant 119888

1003816100381610038161003816minuslog2 119901119886 (119904) minus 119870119879 (119904)1003816100381610038161003816 le 119888 (11)

The consequence of the Coding Theorem is that itassociates the frequency of occurrence of the string 119904 withits complexity In other words if a particular string 119904 canbe generated by many different programs it is consideredldquosimplerdquo On the other hand if a very specific program isrequired to produce the given string 119904 this string can beregarded as ldquocomplexrdquo The Coding Theorem also impliesthat 119870-complexity of a string 119904 can be approximated from itsfrequency using the formula

119870119879 (119904) asymp minuslog2 119901119886 (119904) (12)

This formula has inspired the Algorithmic Nature Labgroup (httpswwwalgorithmicnaturelaborg) to develop theCTM (Coding Theorem Method) a method to approximate119870-complexity by counting output frequencies of small Turingmachines Clearly algorithmic probability of the string 119904cannot be computed exactly because the formula for algo-rithmic probability requires finding all possible programsthat produce the string 119904 Nonetheless for a limited subsetof Turing machines it is possible to count the number ofmachines that produce the given string 119904 and this is the trickbehind the CTM In broad terms the CTM for a string 119904consists in computing the following function

CTM (119904) = 119863 (119899119898 119904)= |119879 isin T (119899119898) 119879 (119875) = 119904||119879 isin T (119899119898) 119879 (119875) halts|

(13)

where T(119899119898) is the space of all universal Turing machineswith 119899 states and 119898 symbols Function 119863(119899119898 119904) computesthe ratio of all halting machines with 119899 states and119898 symbolswhich produce the string 119904 and its value is determined withthe help of known values of the famous Busy Beaver function[33] The Algorithmic Nature Lab group has gathered statis-tics on almost 5 million short strings (maximum length is12 characters) produced by Turing machines with alphabetsranging from 2 to 9 symbols and based on these statisticsthe CTM can approximate the algorithmic probability of agiven string Detailed description of the CTM can be foundin [34] Since the function 119863(119899119898 119904) is an approximation ofthe true algorithmic probability 119901119886(119904) it can also be used toapproximate119870-complexity of the string 119904

The CTM can be applied only to short strings consistingof 12 characters or less For larger strings and matrices theBDM (Block Decomposition Method) should be used TheBDMrequires the decomposition of the string 119904 into (possiblyoverlapping) blocks 1198871 1198872 119887119896 Given a long string 119904 theBDM computes its algorithmic probability as

BDM (119904) = 119896sum119894=1

CTM (119887119894) + log2 10038161003816100381610038161198871198941003816100381610038161003816 (14)

where CTM(119887119894) is the algorithmic complexity of the block 119887119894and |119887119894| denotes the number of times the block 119887119894 appears in 119904Detailed description of the BDM can be found in [35]

Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502

Figure 6 Ouroboros network

Obviously any representation of a nontrivial networkrequires far more than 12 characters Consider once again the3-regular graph presented in Figure 1 The Laplacian matrixrepresentation of this graph is the following

11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[

3 0 minus1 0 minus1 0 0 0 minus1 00 3 0 0 minus1 0 0 minus1 minus1 0minus1 0 3 minus1 0 minus1 0 0 0 00 0 minus1 3 0 minus1 minus1 0 0 0minus1 minus1 0 0 3 0 0 0 0 minus10 0 minus1 minus1 0 3 0 minus1 0 00 0 0 minus1 0 0 3 minus1 0 minus10 minus1 0 0 0 minus1 minus1 3 0 0minus1 minus1 0 0 0 0 0 0 3 minus10 0 0 0 minus1 0 minus1 0 minus1 3

]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)

If we treat each row of the Laplacian matrix as a separateblock the string representation of the Laplacian matrixbecomes 119904 = 1198871 = 3010100010 1198872 = 0300100110 11988710 =0000101013 (for the sake of simplicity we have replacedthe symbol ldquominus1rdquo with the symbol ldquo1rdquo) This input can befed into the BDM producing the final estimation of thealgorithmic probability (and consequently the estimationof the 119870-complexity) of the string representation of theLaplacian matrix In our experiments whenever reportingthe values of 119870-complexity of the string 119904 we actually reportthe value of BDM(119904) as the approximation of the true 119870-complexity

4 Experiments

41 Gradual Change of Networks As we have stated beforethe aim of this research is not to propose a new complexitymeasure for networks but to compare the usefulness androbustness of entropy versus119870-complexity as the underlyingfoundations for complexity measures Let us recall whatproperties are expected from a good and reliable complex-ity measure for networks Firstly the measure should not

8 Complexity

depend on the particular network representation but shouldyield more or less consistent results for all possible losslessrepresentations of a network Secondly the measure shouldnot equate complexitywith randomnessThirdly themeasureshould take into consideration topological properties of a net-work and not be limited to simple counting of the number ofvertices and edges Of course statistical properties of a givennetwork will vary significantly between different networkinvariants but at the base level of network representationthe quantity used to define the complexity measure shouldfulfill the above requirements The main question that we areaiming to answer in this study is whether there are qualitativedifferences between entropy and119870-complexity with regard tothe above-mentioned requirements when measuring varioustypes of networks

In order to answer this question we have to measurehow a change in the underlying network structure affects theobserved values of entropy and119870-complexity To this end wehave devised two scenarios In the first scenario the networkgradually transforms from the perfectly ordered state to acompletely random state The second transformation bringsthe network from the perfectly ordered state to a state whichcan be understood as semiordered albeit in a different wayThe following sections present both scenarios in detail

411 From Watts-Strogatz Small-World Model to Erdos-RenyiRandom Network Model A small-world network modelintroduced byWatts and Strogatz [36] is based on the processwhich transforms a fully ordered network with no randomedge rewiring into a random network According to thesmall-world model vertices of the network are placed on aregular 119896-dimensional grid and each vertex is connected toexactly119898 of its nearest neighbors producing a regular latticeof vertices with equal degrees Then with a small probability119901 each edge is randomly rewired If 119901 = 0 no rewiringoccurs and the network is fully ordered All vertices have thesame degree the same betweenness and the entropy of theadjacencymatrix depends only on the density of edgesWhen119901 ge 0 edge rewiring is applied to edges and this processdistorts the degree distribution of vertices

On the other end of the network spectrum is theErdos-Renyi random network model [37] in which there isno inherent pattern of connectivity between vertices Therandom network emerges by selecting all possible pairsof vertices and creating for each pair an edge with theprobability 119901 Alternatively one can generate all possiblenetworks consisting of 119899 vertices and 119898 edges and thenrandomly pick one of these networksThe construction of therandom network implies the highest degree of randomnessand there is no otherway of describing a particular instance ofsuch network other than by explicitly providing its adjacencymatrix or the Laplacian matrix

In our first experiment we observe the behavior ofentropy and119870-complexity being applied to gradually chang-ing networks We begin with a regular small-world networkgenerated for 119901 = 0 Next we iteratively increase the valueof 119901 by 001 in each step until 119901 = 1 We retain thenetwork between iterations so conceptually it is one network

undergoing the transition Also we only consider rewiringof edges which have not been rewired during precedingiterations so every edge is rewired at most once For 119901 =0 the network forms a regular lattice of vertices and for119901 = 1 the network is fully random with all edges rewiredWhile randomly rewiring edges we do not impose anypreference on the selection of the target vertex of the edgebeing currently rewired that is each vertex has a uniformprobability of being selected as the target vertex of rewiring

412 From Watts-Strogatz Small-World Model to Barabasi-Albert Preferential Attachment Model Another popularmodel of artificial network generation has been introducedby Barabasi and Albert [38] This network model is basedon the phenomenon of preferential attachment according towhich vertices appear consecutively in the network and tendto join existing vertices with a strong preference for highdegree vertices The probability of selecting vertex V119894 as thetarget of a newly created edge is proportional to V119894rsquos degree119889(V119894) Scale-free networks have many interesting properties[39 40] but from our point of view the most interestingaspect of scale-free networks is the fact that they representa particular type of semiorder The behavior of low-degreevertices is chaotic and random and individual vertices aredifficult to distinguish but the structure of high-degreevertices (so-called hubs) imposes a well-defined topologyon the network High-degree vertices serve as bridgeswhich facilitate communication between remote parts of thenetwork and their degrees are highly predictable In otherwords although a vast majority of vertices behave randomlythe order appears as soon as high-degree vertices emerge inthe network

In our second experiment we start from a small-worldnetwork and we increment the edge rewiring probability119901 in each step This time however we do not select thenew target vertex randomly but we use the preferentialattachment principle In the early steps this process is stillrandom as the differences in vertex degrees are relativelysmall but at a certain point the scale-free structure emergesand as more rewiring occurs (for 119901 rarr 1) the network startsorganizing around a subset of high-degree hubsThe intuitionis that a good measure of network complexity should be ableto distinguish between the initial phase of increasing therandomness of the network and the second phase where thesemiorder appears

42 Results and Discussion We experiment only on arti-ficially generated networks using three popular networkmodels Erdos-Renyi randomnetworkmodelWatts-Strogatzsmall-world network model and Barabasi-Albert scale-freenetwork model We have purposefully left out empiricalnetworks from consideration due to a possible bias whichmight have been introduced Unfortunately for empiricalnetworks we do not have a good method of approximatingthe algorithmic probability of a network All we could dois to compare empirical distributions of network properties(such as degree betweenness and local clustering coefficient)with distributions from known generative models In our

Complexity 9

previouswork [41] we have shown that this approach can leadto severe approximation errors as distributions of networkproperties strongly depend on values of model parameters(such as edge rewiring probability in the small-world modelor power-law coefficient in the scale-free model) Without auniversal method of estimating the algorithmic probabilityof empirical networks it is pointless to compare entropyand 119870-complexity of such networks since no baseline canbe established and the results would not yield themselves tointerpretation

In our experiments we have used the acssR package [42]which implements the CodingTheoremMethod [34 43] andthe Block Decomposition Method [35]

Let us now present the results of the first experimentIn this experiment the edge rewiring probability 119901 changesfrom 0 to 1 by 001 in each iteration In each iteration wegenerate 50 instances of the network consisting of 119873 =100 vertices and for each generated network instance wecompute the following measures

(i) Entropy and119870-complexity of the adjacency matrix(ii) Entropy and119870-complexity of the Laplacian matrix(iii) Entropy and119870-complexity of the degree list(iv) Entropy and119870-complexity of the degree distribution

We repeat the experiments described in Section 41 foreach of the 50 networks performing the gradual changeof each of these networks and for each value of the edgerewiring probability 119901 we average the results over all 50networks Since entropy and 119870-complexity are expressed indifferent units we normalize bothmeasures to allow for side-by-side comparison The normalization procedure works asfollows For a given string of characters 119904 with the length119897 = |119904| we generate two strings The first string 119904min consistsof 119897 repeated 0rsquos and it represents the least complex string ofthe length 119897 The second string 119904max is a concatenation of 119897uniformly selected digits and it represents the most complexstring of the length 119897 Each value of entropy and119870-complexityis normalized with respect to minimum and maximum valueof entropy and 119870-complexity possible for a string of equallength This allows us not only to compare entropy and 119870-complexity between different representations of networksbut also to compare entropy to 119870-complexity directly Theresults of our experiments are presented in Figure 7

We observe that traditional entropy of the adjacencymatrix remains constant This is obvious the rewiring ofedges does not change the density of the network (thenumber of edges in the original small-world network andthe final random network or scale-free network is exactlythe same) so entropy of the adjacency matrix is the samefor each value of the edge rewiring probability 119901 On theother hand 119870-complexity of the adjacency matrix slowlyincreases It should be noted that the change of119870-complexityis small when analyzed in absolute values Nevertheless 119870-complexity consistently increases as networks diverge fromthe order of the small-world model toward the chaos ofrandomnetworkmodel A very similar result can be observedfor networks represented using Laplacian matrices Againentropy fails to signal any change in networkrsquos complexity

because the density of networks remains constant throughoutthe transition and the very slight change of entropy for 119901 isin⟨0 025⟩ is caused by the change of the degree list whichforms the main diagonal of the Laplacian matrix The resultfor the degree list is more surprising 119870-complexity of thedegree list slightly increases as networks lose their orderingbut remains close to 04 At the same time entropy increasesquickly as the edge rewiring probability 119901 approaches 1The pattern of entropy growth is very similar for both thetransition to random network and the transition to scale-freenetwork with the latter characterized counterintuitively bylarger entropy In addition the absolute value of entropy forthe degree list is several times larger than for the remainingnetwork representations (the adjacencymatrix and the Lapla-cian matrix) Finally both entropy and119870-complexity behavesimilarly for networks described using degree distributionsWe note that both measures correctly identify the decreaseof apparent complexity as networks approach the scale-free model (when semiorder emerges) and signal increasingcomplexity as networks becomemore andmore random It istempting to conclude from the results of the last experimentthat the degree distribution is the best representation whennetwork complexity is concerned However one should notforget that the degree distribution and the degree list arenot lossless representations of networks so the algorithmiccomplexity of degree distribution only estimates how difficultit is to recreate that distribution and not the entire network

Given the requirements formulated at the beginning ofthis section and the results of the experimental evaluationwe conclude that119870-complexity is a more feasible measure forconstructing intuitive complexity definitions 119870-complexitycaptures small topological changes in the evolving networkswhere entropy cannot detect these changes due to the factthat network density remains constant Also 119870-complexityproduces less variance in absolute values across differentnetwork representations and entropy returns drasticallydifferent estimates depending on the particular networkrepresentation

5 Conclusions

Entropy has been commonly used as the basis for modelingthe complexity of networks In this paper we show whyentropy may be a wrong choice for measuring networkcomplexity Entropy equates complexity with randomnessand requires preselecting the network feature of interest Aswe have shown it is relatively easy to construct a simplenetwork which maximizes entropy of the adjacency matrixthe degree sequence or the betweenness distribution Onthe other hand 119870-complexity equates the complexity withthe length of the computational description of the networkThis measure is much harder to deceive and it provides amore robust and reliable description of the network Whennetworks gradually transform from the highly ordered tohighly disordered states 119870-complexity captures this transi-tion at least with respect to adjacencymatrices and Laplacianmatrices

10 Complexity

Watts-Strogatz to Watts-Strogatz to

VariableEntropy of adjacency matrixK-complexity of adjacency matrix

006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p

Barabaacutesi-Albert Erdoumls-Reacutenyi

(a)


Entropy of Laplacian matrixK-complexity of Laplacian matrix

025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p

VariableEntropy of degree listK-complexity of degree list


(c)


Entropy of degree distributionK-complexity of degree distribution

Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)

Figure 7 Entropy and119870-complexity of (a) adjacency matrix (b) Laplacian matrix (c) degree list and (d) degree distribution under gradualtransition fromWatts-Strogatz model to Erdos-Renyi and Barabasi-Albert models

In this paper we have used traditional methods todescribe a network the adjacency matrix the Laplacianmatrix the degree list and the degree distribution We havelimited the scope of experiments to three most populargenerative network models random networks small-worldnetworks and scale-free networks However it is possible todescribe networks more succinctly using universal networkgenerators In the near future we plan to present a newmethod of computing algorithmic complexity of networks

without having to estimate 119870-complexity but rather fol-lowing the minimum description length principle Alsoextending the experiments to the realmof empirical networkscould prove to be informative and interestingWe also intendto investigate network representations based on variousenergies (Randic energy Laplacian energy and adjacencymatrix energy) and to research the relationships betweennetwork energy and 119870-complexity

Complexity 11

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The authors would like to thank Adrian Szymczak for helpingto devise the degree sequence network This work wassupported by the National Science Centre Poland Decisionsnos DEC-201623BST603962 DEC-201621BST601463and DEC-201621DST602948 European Unionrsquos Horizon2020 Research and Innovation Program under the MarieSkłodowska-Curie Grant Agreement no 691152 (RENOIR)and the Polish Ministry of Science and Higher EducationFund for supporting internationally cofinanced projects in2016ndash2019 (Agreement no 3628H202020162)

References

[1] L Todd Veldhuizen ldquoSoftware libraries and their reuseEntropy kolmogorov complexity and zipf rsquos lawrdquo Library-Centric Software Design (LCSDrsquo05) p 11 2005

[2] D Bonchev and G A Buck ldquoQuantitative measures of networkcomplexityrdquo in Complexity in chemistry biology and ecologyMath Comput Chem pp 191ndash235 Springer New York 2005

[3] J Cardoso J Mendling G Neumann and H A ReijersldquoA discourse on complexity of process modelsrdquo in BusinessProcess Management Workshops vol 4103 of Lecture Notesin Computer Science pp 117ndash128 Springer Berlin HeidelbergBerlin Heidelberg 2006

[4] J Cardoso ldquoComplexity analysis of BPEL Web processesrdquoSoftware Process Improvement and Practice vol 12 no 1 pp 35ndash49 2007

[5] A Latva-Koivisto 2001 Finding a complexity measure forbusiness process models

[6] GMConstantine ldquoGraph complexity and the Laplacianmatrixin blocked experimentsrdquo Linear andMultilinear Algebra vol 28no 1-2 pp 49ndash56 1990

[7] D L Neel andM EOrrison ldquoThe linear complexity of a graphrdquoElectronic Journal of Combinatorics vol 13 no 1 Research Paper9 19 pages 2006

[8] J Kim and T Wilhelm ldquoWhat is a complex graphrdquo PhysicaA Statistical Mechanics and its Applications vol 387 no 11 pp2637ndash2652 2008

[9] M Dehmer F Emmert-Streib Z Chen X Li and Y Shi JohnWiley amp Sons Mathematical foundations and applications ofgraph entropy 2016

[10] M Dehmer and A Mowshowitz ldquoA history of graph entropymeasuresrdquo Information Sciences An International Journal vol181 no 1 pp 57ndash78 2011

[11] A Mowshowitz and M Dehmer ldquoEntropy and the complexityof graphs revisitedrdquo Entropy An International and Interdisci-plinary Journal of Entropy and Information Studies vol 14 no3 pp 559ndash570 2012

[12] S Cao M Dehmer and Y Shi ldquoExtremality of degree-basedgraph entropiesrdquo Information Sciences An International Journalvol 278 pp 22ndash33 2014

[13] Z Chen M Dehmer and Y Shi ldquoA note on distance-basedgraph entropiesrdquo Entropy An International and Interdisciplinary

Journal of Entropy and Information Studies vol 16 no 10 pp5416ndash5427 2014

[14] K C Das and S Sorgun ldquoOn randic energy of graphsrdquoMATCHCommunMath Comput Chem vol 72 no 1 pp 227ndash238 2014

[15] I Gutman and B Zhou ldquoLaplacian energy of a graphrdquo LinearAlgebra and its Applications vol 414 no 1 pp 29ndash37 2006

[16] C E Shannon The Mathematical Theory of CommunicationThe University of Illinois Press Urbana Ill USA 1949

[17] A N Kolmogorov ldquoThree approaches to the quantitativedefinition of informationrdquo International Journal of ComputerMathematics Section A Programming Theory and MethodsSection B Computational Methods vol 2 pp 157ndash168 1968

[18] A Renyi et al ldquoOn measures of entropy and informationrdquoin Proceedings of the 4th Berkeley Symposium on MathematicalStatistics and Probability pp 547ndash561 University of CaliforniaPress 1961

[19] A Mowshowitz ldquoEntropy and the complexity of graphs I Anindex of the relative complexity of a graphrdquo The Bulletin ofMathematical Biophysics vol 30 no 1 pp 175ndash204 1968

[20] I Ji W Bing-Hong W Wen-Xu and Z Tao ldquoNetwork entropybased on topology configuration and its computation to ran-dom networksrdquo Chinese Physics Letters vol 25 no 11 p 41772008

[21] M Dehmer ldquoInformation processing in complex networksgraph entropy and information functionalsrdquo Applied Mathe-matics and Computation vol 201 no 1-2 pp 82ndash94 2008

[22] S Cao M Dehmer and Z Kang ldquoNetwork entropies basedon independent sets and matchingsrdquo Applied Mathematics andComputation vol 307 pp 265ndash270 2017

[23] J Korner ldquoFredman-komlos bounds and information theoryrdquoSIAM Journal on Algebraic Discrete Methods vol 7 no 4 pp560ndash570 1986

[24] H Zenil N A Kiani and J Tegner ldquoLow-algorithmic-complexity entropy-deceiving graphsrdquo Physical Review E vol96 no 1 2017

[25] N Sloane ldquoThe on-line encyclopedia of integer sequencesrdquohttpsoeisorgA033308

[26] C Faloutsos and V Megalooikonomou ldquoOn data miningcompression and Kolmogorov complexityrdquo Data Mining andKnowledge Discovery vol 15 no 1 pp 3ndash20 2007

[27] J Schmidhuber ldquoDiscovering neural nets with lowKolmogorovcomplexity and high generalization capabilityrdquo Neural Net-works vol 10 no 5 pp 857ndash873 1997

[28] A Kulkarni and S Bush ldquoDetecting distributed denial-of-service attacks using Kolmogorov Complexity metricsrdquo Journalof Network and Systems Management vol 14 no 1 pp 69ndash802006

[29] M Li and P M B Vitanyii ldquoKolmogorov complexity andits applicationsrdquo in Algorithms and Complexity Texts andMonographs in Computer Science pp 1ndash187 Springer-VerlagNew York 2014

[30] G J Chaitin ldquoOn the length of programs for computing finitebinary sequencesrdquo Journal of the Association for ComputingMachinery vol 13 pp 547ndash569 1966

[31] L A Levin ldquoLaws on the conservation (zero increase) of infor-mation and questions on the foundations of probability theoryrdquoAkademiya Nauk SSSR Institut Problem Peredachi InformatsiiAkademii Nauk SSSR Problemy Peredachi Informatsii vol 10no 3 pp 30ndash35 1974

[32] R J Solomonoff ldquoA formal theory of inductive inference PartIrdquo Information and Control vol 7 no 1 pp 1ndash22 1964

12 Complexity

[33] T Rado ldquoOn non-computable functionsrdquo The Bell SystemTechnical Journal vol 41 pp 877ndash884 1962

[34] F Soler-Toscano H Zenil J-P Delahaye and N Gauvrit ldquoCal-culating Kolmogorov complexity from the output frequencydistributions of small turing machinesrdquo PLoS ONE vol 9 no5 Article ID e96223 2014

[35] H Zenil F Soler-Toscano N A Kiani S Hernandez-Orozcoand A Rueda-Toicen A decomposition method for global eval-uation of shannon entropy and local estimations of algorithmiccomplexity 2016

[36] D J Watts and S H Strogatz ldquoCollective dynamics of ldquosmall-worldrdquo networksrdquoNature vol 393 no 6684 pp 440ndash442 1998

[37] P Erdos and A Renyi ldquoOn the evolution of random graphsrdquoPublications of the Mathematical Institute of the HungarianAcademy of Sciences vol 5 pp 17ndash61 1960

[38] A-L Barabasi and R Albert ldquoEmergence of scaling in randomnetworksrdquoAmericanAssociation for theAdvancement of Sciencevol 286 no 5439 pp 509ndash512 1999

[39] A-L Barabasi ldquoScale-Free Networks a Decade and beyondrdquoAmerican Association for the Advancement of Science Sciencevol 325 no 5939 pp 412-413 2009

[40] M E Newman ldquoThe structure and function of complexnetworksrdquo SIAM Review vol 45 no 2 pp 167ndash256 2003

[41] T Kajdanowicz andMMorzy ldquoUsing Kolmogorov ComplexitywithGraph andVertex Entropy toMeasure Similarity of Empir-ical Graphs with Theoretical Graph Modelsrdquo in Proceedings ofthe 2nd International Electronic Conference on Entropy and ItsApplications p C003 Sciforumnet

[42] N Gauvrit H Singmann F Soler-Toscano and H ZenilldquoAlgorithmic complexity for psychology a user-friendly imple-mentation of the coding theorem methodrdquo Behavior ResearchMethods vol 48 no 1 pp 314ndash329 2016

[43] J-P Delahaye and H Zenil ldquoNumerical evaluation of algorith-mic complexity for short strings A glance into the innermoststructure of randomnessrdquo Applied Mathematics and Computa-tion vol 219 no 1 pp 63ndash77 2012

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of


Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of


Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of


Mathematical PhysicsAdvances in

Complex AnalysisJournal of


OptimizationJournal of


CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of


Operations ResearchAdvances in

Journal of


Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences


The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014


Algebra

Discrete Dynamics in Nature and Society



Decision SciencesAdvances in

Journal of


Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

2 Complexity

or unexpectedness As has been pointed out [8] withinthe spectrum of possible networks from the most ordered(cliques paths and stars) to the most disordered (randomnetworks) complex networks occupy the very center ofthis spectrum Finally a good complexity measure shouldnot depend on a particular network representation andshould yield consistent results for various representationsof the same network (adjacency matrix Laplacian matrixand degree sequence) Unfortunately as current researchsuggests finding a good complexity measure applicable to awide variety of networks is very challenging [9ndash11]

Among many possible measures which can be used todefine the complexity of networks the entropy of variousnetwork invariants has been by far the most popular choiceNetwork invariants considered for defining entropy-basedcomplexity measures include number of vertices number ofneighbors number of neighbors at a given distance [12] dis-tance between vertices [13] energy of network matrices suchas Randic matrix [14] or Laplacian matrix [15] and degreesequences There are multiple definitions of entropies usu-ally broadly categorized into three families thermodynamicentropies statistical entropies and information-theoreticentropies In the field of computer science information-theoretic measures are the most prevalent and they includeShannon entropy [16] Kolmogorov-Sinai entropy [17] andRenyi entropy [18] These entropies are based on the conceptof the information content of a system and they mea-sure the amount of information required to transmit thedescription of an object The underlying assumption ofusing information-theoretic definitions of entropy is thatuncertainty (as measured by entropy) is a nondecreasingfunction of the amount of available information In otherwords systems in which little information is available arecharacterized by low entropy and therefore are consideredto be ldquosimplerdquo The first idea to use entropy to quantify thecomplexity of networks comes fromMowshowitz [19]

Despite the ubiquitousness of general-purpose entropydefinitions many researchers have developed specializedentropy definitions aimed at describing the structure ofnetworks [10] Notable examples of such definitions includethe proposal by Ji et al to measure the unexpectedness of aparticular network by comparing it to the number of possiblenetwork configurations available for a given set of parameters[20] This concept is clearly inspired by algorithmic entropywhich defines the complexity of a system not in terms of itsinformation content but in terms of its generative processA different approach to measure the entropy of networks hasbeen introduced by Dehmer under the form of informationfunctional [21] Information functional can be also used toquantify network entropy in terms of 119896-neighborhoods ofvertices [12 13] or independent sets of vertices [22] Yetanother approach to network entropy has been proposed byKorner who advocates the use of stable sets of vertices as thebasis to compute network entropy [23] Several comprehen-sive surveys of network entropy applications are also available[9 11]

Within the realm of information science the complexityof a system is most often associated with the number of

possible interactions between elements of the system Com-plex systems evolve over time they are sensitive to evenminor perturbations at the initial steps of development andoften involve nontrivial relationships between constituentelements Systems exhibiting high degree of interconnect-edness in their structure andor behavior are commonlythought to be difficult to describe and predict and as aconsequence such systems are considered to be ldquocomplexrdquoAnother possible interpretation of the term ldquocomplexrdquo relatesto the size of the system In the case of networks one mightconsider to use the number of vertices and edges to estimatethe complexity of a network However the size of the networkis not a good indicator of its complexity because networkswhich have well-defined structures and behaviors are ingeneral computationally simple

In this work we do not introduce a new complex-ity measure or propose new informational functional andnetwork invariants on which an entropy-based complexitymeasure could be defined Rather we follow the observationsformulated in [24] andwe present the criticism of the entropyas the guiding principle of complexity measure construc-tion Thus we do not use any specific formal definitionof complexity but we provide additional arguments whyentropy may be easily deceived when trying to evaluatethe complexity of a network Our main hypothesis is thatalgorithmic entropy also known as Kolmogorov complexityis superior to traditional Shannon entropy due to the factthat algorithmic entropy is more robust less dependent onthe network representation and better aligned with intuitivehuman understanding of complexity

The organization of the paper is the following In Sec-tion 2 we introduce basic definitions related to entropy andwe formulate arguments against the use of entropy as thecomplexity measure of networks Section 23 presents severalexamples of entropy-deceiving networks which provide bothmotivation and anecdotal evidence for our hypothesis InSection 3 we introduce Kolmogorov complexity andwe showhow this measure can be applied to networks despite itshigh computational cost The results of the experimentalcomparison of entropy and Kolmogorov complexity arepresented in Section 4The paper concludes in Section 5 witha brief summary and future work agenda

2 Entropy as the Measure ofNetwork Complexity

21 Basic Definitions Let us introduce basic definitions andnotation used throughout the remainder of this paper Anetwork is an ordered pair119866 = ⟨119881 119864⟩ where119881 = V1 V119873is the set of vertices and 119864 = (V119894 V119895) isin 119881 times 119881 is the setof edges The degree 119889(V119894) of the vertex V119894 is the number ofvertices adjacent to it 119889(V119894) = |V119895 (V119894 V119895) isin 119864| A givennetwork can be represented inmany ways for instance usingan adjacency matrix defined as

119860119873times119873 [119894 119895] = 1 if (V119894 V119895) isin 1198640 otherwise (1)

Complexity 3


119871119873times119873 [119894 119895] =


(2)


119875 (119889119894) = 119901 (119889 (V119895) = 119889119894) =10038161003816100381610038161003816V119895 isin 119881 119889 (V119895) = 11988911989410038161003816100381610038161003816119873 (3)


119867(119883) = minus 119899sum119894=1

119901 (119909119894) log119887 119901 (119909119894) (4)


lim119899rarrinfin

119867(119883) = minusint119901 (119909) 119901 (119909)119898 (119909) 119889119909 (5)


11986010times10 =

[[[[[[[[[[[[[[[[[[[[[[[

0 0 1 0 1 0 0 0 1 00 0 0 0 1 0 0 1 1 01 0 0 1 0 1 0 0 0 00 0 1 0 0 1 1 0 0 01 1 0 0 0 0 0 0 0 10 0 1 1 0 0 0 1 0 00 0 0 1 0 0 0 1 0 10 1 0 0 0 1 1 0 0 01 1 0 0 0 0 0 0 0 10 0 0 0 1 0 1 0 1 0

]]]]]]]]]]]]]]]]]]]]]]]

(6)

1

2

3

4

5

6

7

89

10






4 Complexity







1198723times3 = [[[0 1 01 0 10 0 1

]]] (7)


119867119897 (119883)= minus sum1199091isin119883


119901 (1199091 119909119897) log2 119901 (1199091 119909119897) (8)






end forend for

Complexity 5

Degree123

456

789

101112

131415

161718

19




Degree101112

131415

161718





6 Complexity





119870119879 (119904) = min |119875| 119879 (119875) = 119904 (9)


119901119886 (119904) = sum119875119879(119875)=119904

12|119875| (10)





119870119879 (119904) asymp minuslog2 119901119886 (119904) (12)



(13)



BDM (119904) = 119896sum119894=1

CTM (119887119894) + log2 10038161003816100381610038161198871198941003816100381610038161003816 (14)


Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502



11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[


]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)


4 Experiments


8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




Complexity 3


119871119873times119873 [119894 119895] =


(2)


119875 (119889119894) = 119901 (119889 (V119895) = 119889119894) =10038161003816100381610038161003816V119895 isin 119881 119889 (V119895) = 11988911989410038161003816100381610038161003816119873 (3)


119867(119883) = minus 119899sum119894=1

119901 (119909119894) log119887 119901 (119909119894) (4)


lim119899rarrinfin

119867(119883) = minusint119901 (119909) 119901 (119909)119898 (119909) 119889119909 (5)


11986010times10 =

[[[[[[[[[[[[[[[[[[[[[[[

0 0 1 0 1 0 0 0 1 00 0 0 0 1 0 0 1 1 01 0 0 1 0 1 0 0 0 00 0 1 0 0 1 1 0 0 01 1 0 0 0 0 0 0 0 10 0 1 1 0 0 0 1 0 00 0 0 1 0 0 0 1 0 10 1 0 0 0 1 1 0 0 01 1 0 0 0 0 0 0 0 10 0 0 0 1 0 1 0 1 0

]]]]]]]]]]]]]]]]]]]]]]]

(6)

1

2

3

4

5

6

7

89

10






4 Complexity







1198723times3 = [[[0 1 01 0 10 0 1

]]] (7)


119867119897 (119883)= minus sum1199091isin119883


119901 (1199091 119909119897) log2 119901 (1199091 119909119897) (8)






end forend for

Complexity 5

Degree123

456

789

101112

131415

161718

19




Degree101112

131415

161718





6 Complexity





119870119879 (119904) = min |119875| 119879 (119875) = 119904 (9)


119901119886 (119904) = sum119875119879(119875)=119904

12|119875| (10)





119870119879 (119904) asymp minuslog2 119901119886 (119904) (12)



(13)



BDM (119904) = 119896sum119894=1

CTM (119887119894) + log2 10038161003816100381610038161198871198941003816100381610038161003816 (14)


Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502



11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[


]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)


4 Experiments


8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




4 Complexity







1198723times3 = [[[0 1 01 0 10 0 1

]]] (7)


119867119897 (119883)= minus sum1199091isin119883


119901 (1199091 119909119897) log2 119901 (1199091 119909119897) (8)






end forend for

Complexity 5

Degree123

456

789

101112

131415

161718

19




Degree101112

131415

161718





6 Complexity





119870119879 (119904) = min |119875| 119879 (119875) = 119904 (9)


119901119886 (119904) = sum119875119879(119875)=119904

12|119875| (10)





119870119879 (119904) asymp minuslog2 119901119886 (119904) (12)



(13)



BDM (119904) = 119896sum119894=1

CTM (119887119894) + log2 10038161003816100381610038161198871198941003816100381610038161003816 (14)


Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502



11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[


]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)


4 Experiments


8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




Complexity 5

Degree123

456

789

101112

131415

161718

19




Degree101112

131415

161718





6 Complexity





119870119879 (119904) = min |119875| 119879 (119875) = 119904 (9)


119901119886 (119904) = sum119875119879(119875)=119904

12|119875| (10)





119870119879 (119904) asymp minuslog2 119901119886 (119904) (12)



(13)



BDM (119904) = 119896sum119894=1

CTM (119887119894) + log2 10038161003816100381610038161198871198941003816100381610038161003816 (14)


Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502



11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[


]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)


4 Experiments


8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




6 Complexity





119870119879 (119904) = min |119875| 119879 (119875) = 119904 (9)


119901119886 (119904) = sum119875119879(119875)=119904

12|119875| (10)





119870119879 (119904) asymp minuslog2 119901119886 (119904) (12)



(13)



BDM (119904) = 119896sum119894=1

CTM (119887119894) + log2 10038161003816100381610038161198871198941003816100381610038161003816 (14)


Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502



11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[


]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)


4 Experiments


8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




Complexity 7

Betweenness0

749888167388167

159049062049062

199958513708514

214806998556999

215144660894661

221443362193362

221949855699856

224691919191919

225430375180375

239443001443001

240419191919192

242144660894661

246006132756133

254229076479077

254655483405483

255573232323232

278713564213564

299284992784993

443902597402597

468164502164502



11987110times10

=

[[[[[[[[[[[[[[[[[[[[[[[[[[


]]]]]]]]]]]]]]]]]]]]]]]]]]

(15)


4 Experiments


8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




8 Complexity










Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




Complexity 9









5 Conclusions


10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




10 Complexity



006

007

008

009

Valu

e

025

050

075

100

000

025

050

075

100

000

p


(a)



025

050

075

100

000

025

050

075

100

000

p

010

012

014

016

018

Valu

e

Variable


(b)


04

05

06

07

Valu

e

025

050

075

100

000

025

050

075

100

000

p



(c)



Variable

025

050

075

100

000

025

050

075

100

000

p

05

06

07

Valu

e


(d)




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




Complexity 11



Acknowledgments


References


































12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




12 Complexity



















Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of











Volume 2014




Journal of











Journal of


Function Spaces






Algebra





Journal of




On Measuring the Complexity of Networks: Kolmogorov Complexity...

Documents

Transcript of On Measuring the Complexity of Networks: Kolmogorov Complexity...