Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend...

16
5730 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018 Identification of Overlapping Communities via Constrained Egonet Tensor Decomposition Fatemeh Sheikholeslami and Georgios B. Giannakis Abstract—Detection of overlapping communities in real-world networks is a generally challenging task. Upon recognizing that a network is in fact the union of its egonets, a novel network rep- resentation using multiway data structures is advocated in this contribution. The introduced sparse tensor-based representation exhibits richer structure compared to its matrix counterpart and, thus, enables a more robust approach to community detection. To leverage this structure, a constrained tensor approximation frame- work is introduced using PARAFAC decomposition. The aris- ing constrained trilinear optimization is handled via alternating minimization, where intermediate subproblems are solved using the alternating direction method of multipliers to ensure conver- gence. The factors obtained provide soft community memberships, which can further be exploited for crisp, and possibly-overlapping community assignments. The framework is further broadened to include time-varying graphs, where the edgeset as well as the un- derlying communities evolve through time. The performance of the proposed approach is assessed via tests on benchmark syn- thetic graphs as well as real-world networks. As corroborated by numerical tests, the proposed tensor-based representation captures multihop nodal connections, that is, connectivity patterns within single-hop neighbors, whose exploitation yields a more robust com- munity identification in the presence of mixing as well as overlap- ping communities. Index Terms—Community detection, overlapping communities, egonet subgraphs, tensor decomposition, constrained PARAFAC, sparse tensors. I. INTRODUCTION G RAPH representation of complex real-world networks provides an invaluable tool for analysis and discovery of intrinsic attributes present in social, biological, and financial networks. One such attribute is the presence of small subgraphs, referred to as “communities” or “clusters,” whose dense intra- connections and sparse inter-connections often represents a po- tential “association” among the participating entities (nodes). The task of community identification targets the discovery of such highly-interwoven nodes, and is of paramount interest in areas as diverse as unveiling functional modules in biological Manuscript received July 13, 2017; revised February 19, 2018, May 11, 2018, and September 10, 2018; accepted September 11, 2018. Date of publication September 24, 2018; date of current version October 1, 2018. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Pierre Borgnat. This work was supported by the National Science Foundation under Grants 1500713, 1442686, and 1711471. (Corresponding author: Georgios B. Giannakis.) The authors are with the Department of Electrical and Computer Engineering and Digital Technology Center, University of Minnesota, Minneapolis, MN 55455 USA (e-mail:, [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSP.2018.2871383 networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems [4]. Past works on community detection include those based on modularity-maximization [5], [6], generative and statisti- cal models such as mixed membership stochastic block models (MMSB) [7]–[12], local-metric optimization[13], spectral clus- tering [14], and matrix factorization [15]–[19]; see e.g. [20], [21] for a comprehensive overview. With recent exploratoty studies over contemporary real-world networks, new challenges have been raised in community identification, addressing the presence of overlapping communities [22]–[24], multimodal interaction of nodes over multiview networks [25], [26], ex- ploitation of nodal and edge-related side-information [27], as well as dynamic interactions [28], [29]. In handling such new challenges, reliance on the adjacency matrix representation of networks limits their capabilities in capturing higher-order interactions, which can potentially pro- vide critical information via temporal, multi-modal, or even multi-hop connectivity among nodes. To this end, tensors as multi-way data structures provide a viable alternative, whose increased representational capabilities can potentially lead to a more informative community identification [25], [26], [28], [30], [31]. For instance, [32] and [33] construct higher-order tensors whose entries are non-zero if a tuple of nodes jointly belongs to a cycle or a clique, while [28] captures temporal dy- namics of communities via tensors. Under certain conditions, tensor decompositions are unique [34]–[36], and can guarantee identifiability of the community structure. The present work develops a novel tensor-based network rep- resentation by recognizing that a network is the union of its egonets. An egonet is defined per node as the subgraph in- duced by the node itself, its one-hope neighbors, and all their connections, whose structure has been exploited in anomaly de- tection [37], and user-specific community identification [38], [39]. By concatenating egonet adjacency matrices along the 3-rd dimension of a three-way tensor, the proposed network repre- sentation, named egonet-tensor, captures information per node beyond its one-hop connectivity patterns. In fact, in a number of practical networks only adjacency matrix of the network is given, rendering egonets a unique candidate for enhancing com- munity identification performance when extra nodal features are kept private, e.g., Amazon costumer graphs. By construction, egonet-tensor exhibits richer structure compared to its matrix counterpart, which is further exploited by casting the commu- nity detection task in a constrained tensor decomposition frame- work. Building on preliminary results in [31], solvers with con- vergence guarantees are developed for the proposed constrained non-convex optimization, whose solution yields the community- revealing components, utilized for soft as well as crisp commu- nity assignments unveiling possibly overlapping communities. 1053-587X © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications standards/publications/rights/index.html for more information.

Transcript of Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend...

Page 1: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5730 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

Identification of Overlapping Communities viaConstrained Egonet Tensor Decomposition

Fatemeh Sheikholeslami and Georgios B. Giannakis

Abstract—Detection of overlapping communities in real-worldnetworks is a generally challenging task. Upon recognizing that anetwork is in fact the union of its egonets, a novel network rep-resentation using multiway data structures is advocated in thiscontribution. The introduced sparse tensor-based representationexhibits richer structure compared to its matrix counterpart and,thus, enables a more robust approach to community detection. Toleverage this structure, a constrained tensor approximation frame-work is introduced using PARAFAC decomposition. The aris-ing constrained trilinear optimization is handled via alternatingminimization, where intermediate subproblems are solved usingthe alternating direction method of multipliers to ensure conver-gence. The factors obtained provide soft community memberships,which can further be exploited for crisp, and possibly-overlappingcommunity assignments. The framework is further broadened toinclude time-varying graphs, where the edgeset as well as the un-derlying communities evolve through time. The performance ofthe proposed approach is assessed via tests on benchmark syn-thetic graphs as well as real-world networks. As corroborated bynumerical tests, the proposed tensor-based representation capturesmultihop nodal connections, that is, connectivity patterns withinsingle-hop neighbors, whose exploitation yields a more robust com-munity identification in the presence of mixing as well as overlap-ping communities.

Index Terms—Community detection, overlapping communities,egonet subgraphs, tensor decomposition, constrained PARAFAC,sparse tensors.

I. INTRODUCTION

GRAPH representation of complex real-world networksprovides an invaluable tool for analysis and discovery

of intrinsic attributes present in social, biological, and financialnetworks. One such attribute is the presence of small subgraphs,referred to as “communities” or “clusters,” whose dense intra-connections and sparse inter-connections often represents a po-tential “association” among the participating entities (nodes).The task of community identification targets the discovery ofsuch highly-interwoven nodes, and is of paramount interest inareas as diverse as unveiling functional modules in biological

Manuscript received July 13, 2017; revised February 19, 2018, May 11, 2018,and September 10, 2018; accepted September 11, 2018. Date of publicationSeptember 24, 2018; date of current version October 1, 2018. The associateeditor coordinating the review of this manuscript and approving it for publicationwas Dr. Pierre Borgnat. This work was supported by the National ScienceFoundation under Grants 1500713, 1442686, and 1711471. (Correspondingauthor: Georgios B. Giannakis.)

The authors are with the Department of Electrical and Computer Engineeringand Digital Technology Center, University of Minnesota, Minneapolis, MN55455 USA (e-mail:,[email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSP.2018.2871383

networks such as brain [1], trend analysis in social media [2],[3], and clustering of costumers in recommender systems [4].

Past works on community detection include those basedon modularity-maximization [5], [6], generative and statisti-cal models such as mixed membership stochastic block models(MMSB) [7]–[12], local-metric optimization[13], spectral clus-tering [14], and matrix factorization [15]–[19]; see e.g. [20],[21] for a comprehensive overview. With recent exploratotystudies over contemporary real-world networks, new challengeshave been raised in community identification, addressing thepresence of overlapping communities [22]–[24], multimodalinteraction of nodes over multiview networks [25], [26], ex-ploitation of nodal and edge-related side-information [27], aswell as dynamic interactions [28], [29].

In handling such new challenges, reliance on the adjacencymatrix representation of networks limits their capabilities incapturing higher-order interactions, which can potentially pro-vide critical information via temporal, multi-modal, or evenmulti-hop connectivity among nodes. To this end, tensors asmulti-way data structures provide a viable alternative, whoseincreased representational capabilities can potentially lead toa more informative community identification [25], [26], [28],[30], [31]. For instance, [32] and [33] construct higher-ordertensors whose entries are non-zero if a tuple of nodes jointlybelongs to a cycle or a clique, while [28] captures temporal dy-namics of communities via tensors. Under certain conditions,tensor decompositions are unique [34]–[36], and can guaranteeidentifiability of the community structure.

The present work develops a novel tensor-based network rep-resentation by recognizing that a network is the union of itsegonets. An egonet is defined per node as the subgraph in-duced by the node itself, its one-hope neighbors, and all theirconnections, whose structure has been exploited in anomaly de-tection [37], and user-specific community identification [38],[39]. By concatenating egonet adjacency matrices along the 3-rddimension of a three-way tensor, the proposed network repre-sentation, named egonet-tensor, captures information per nodebeyond its one-hop connectivity patterns. In fact, in a numberof practical networks only adjacency matrix of the network isgiven, rendering egonets a unique candidate for enhancing com-munity identification performance when extra nodal features arekept private, e.g., Amazon costumer graphs. By construction,egonet-tensor exhibits richer structure compared to its matrixcounterpart, which is further exploited by casting the commu-nity detection task in a constrained tensor decomposition frame-work. Building on preliminary results in [31], solvers with con-vergence guarantees are developed for the proposed constrainednon-convex optimization, whose solution yields the community-revealing components, utilized for soft as well as crisp commu-nity assignments unveiling possibly overlapping communities.

1053-587X © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications standards/publications/rights/index.html for more information.

Page 2: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5731

Fig. 1. Construction of egonet-tensor with frontal slabs as egonet adjacencies.

The rest of the paper is organized as follows. Section IIintroduces the novel tensor-based network representation, andSection III presents the constrained tensor decomposition, andits efficient solver for the task of community identification.Section IV provides analytical results corroborating theunderlying intuition. Performance metrics for evaluating thequality of detected communities in networks with and withoutground-truth communities is the subject of Section V, andSection VI introduces identification of time-varying com-munities using the proposed algorithm. Section VII providesnumerical tests, while Section VIII concludes the paper.

Notation: Lower- (upper-) case boldface letters denote col-umn vectors (matrices), and underlined upper-case boldface let-ters stand for tensor structures. Calligraphic symbols are re-served for sets, while T stands for transposition. Symbols ◦, ⊗and � are reserved for outer-product, Kronecker-product andKhatri-Rao-product, respectively, Tr{X} denotes the trace ofmatrix X, and xij denotes the (i, j)-th entry of matrix X.

II. EGONET-TENSOR CONSTRUCTION

Consider a graph G = (V, E ,W), where V , E , and W ∈RN×N respectively denote the set of N nodes, i.e., |V| = N ,edges, and the adjacency matrix. In the case of binary net-works, wij := 1 if (i, j) ∈ E , and wij := 0 otherwise. Fur-thermore, the egonet of node n is defined as the subgraph in-duced by node n, its single-hop neighbors, and all their edges.Let G(n) := (V, E (n) ,W(n)) ⊂ G be the subgraph with E (n)

the egonet edgeset, and W(n) ∈ RN×N the corresponding ad-jacency matrix whose non-zero support captures the edges inE (n) ; that is,

w(n)ij :=

{wij if (i, j) ∈ E (n)

0 otherwise.

where wij denotes the nonzero edge between nodes i andj, and E (n) := {(u, v)|∀(u, v) ∈ E , (n, v) ∈ E , (n, u) ∈ E}.Fig. 1(a) illustrates such subgraphs, where the black node cor-responds to the central node of the egonet, and the single-hopneighbors are colored green. Typically, the center node n isexcluded from G(n) , but it is included here for convenience.

As Fig. 1(b) depicts, graph G can now be fully described by athree-way egonet-tensor W ∈ RN×N×N , where frontal slabscorrespond to egonet adjacency matrices {W(n)}Nn=1 , stackedone after the other. In tensor parlance, that is tantamount tosetting the n-th frontal slab as W:,:,n := W(n) , where : is afree index that spans its range.

The advantage of presenting a graph with its egonet-tensorlies in the fact that tensors as higher-order structures are capableof capturing “useful information” along their different dimen-sions, a.k.a., “modes.” In a network with underlying community

Fig. 2. (a) A toy network with three fully connected non-overlapping com-munities; (b) corresponding egonet-tensor; (c) its community-revealing fac-torization via CPD; (d) overlapping communities; and (e) correspondingegonet-tensor.

structure, the proposed egonet-tensor representation is indeedcapable of preserving the inherent “similarities” among egonetadjacencies along the 3-rd mode, whose extraction is crucialin tasks such as community detection. Such representation isof particular interest for various settings where no nodal fea-tures are provided (that is, only the adjacency matrix is given),making egonets very appealing for an improved network repre-sentation as well as the development of robust schemes for thetask of interest. The following toy example clarifies how suchsimilarities induce a structure over the egonet tensor, and intu-itively discusses how its exploitation can lead to an improvedperformance.

A. Toy Example

Let us consider a toy network with three fully-connected com-munities, illustrated in Fig. 2(a). Since each node is a memberof a fully-connected community, the binary adjacency matrix ofits egonet is identical to that of any other node in its residentcommunity. Furthermore, after permutation, this egonet adja-cency matrix consists of a single block of nonzero entries (withzero diagonal entries if the network is free of self-loops), andzeros elsewhere. This implies that the egonet-tensor can be per-muted similarly into a block-diagonal tensor with three nonzeroblocks, as illustrated in Fig. 2(b).

Adopting the well-known canonical polyadic decomposition(CPD) to decompose W into its constituent rank-one tensors,the model naturally approximates the tensor with rank three,thus revealing the number of underlying communities. In fact, ifthe diagonal entries were all set to 1, i.e. considering self loopsfor all the neighboring nodes in an egonet, this approximationwould be exact; see Fig. 2(c).

In practice, real-world networks often demonstrate overlap-ping community structure, where some nodes are associated

Page 3: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5732 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

Fig. 3. Visualization of an LFR network with N = 50, μ = 0.2, and five “shared” nodes {4, 18, 23, 39, 47}, represented by a larger radius, with (a) hardcommunity association via Infomap; and (b) soft association via EgoTen. Pie-charts depict association ratios.

with multiple communities rather than a single one. To ad-dress such cases, consider the augmented network in Fig. 2(d),where a new node (or a super node corresponding to more thanone node) associated with two communities is added. Oncethe corresponding egonet tensor is constructed, the presence ofoverlapping communities manifests itself in overlapping diag-onal blocks in the egonet tensor; see Fig. 2(e) in comparisonwith disjoint blocks in Fig. 2(b). As it will become evident inSection II, by exploiting a strutured CPD on the arising egonet-tensor, it can be shown that the frontal slab corresponding to theegonet adjacency of an overlapping node is approximated bymultiple summands, each corresponding to one of its residentcommunities. The egonet-tensor representation naturally tradesoff flexibility for increased redundancy and memory costs. Nev-ertheless, resulting tensor is extremely sparse, and off-the-shelftools for sparse tensor computations can be readily utilized; seee.g., [40]–[42].

Unfortunately, such idealistic assumptions where each com-munity is fully-connected within and well-separated from othercommunities, are not fulfilled in real-world networks. Neverthe-less, the inherent similarities among egonet adjacencies induce apotentially useful reinforced structure along the 3rd mode of theproposed egonet-tensor representation. For instance, nodes ina community often exhibit dense (rather than full) connectionsamong themselves, and fewer connections with other commu-nities. This property is consequently reflected in the egonet-tensor (as well as the traditional matrix adjacency) represen-tation by dense diagonal blocks, whose clear separation fadesaways as out-of-community connections increase. However, thepresence of overlapping communities can further smear theblock-structure as overlapping nodes are “well-connected” withmultiple communities. The performance of traditional commu-nity detection methods often dramatically degrades in networkswith such properties, whereas exploiting the structured redun-dancy offered via the proposed egonet-tensor representation andcasting the problem in a community-revealing tensor decompo-sition framework increases robustness against the aforemen-tioned phenomena; see Fig. 3.

III. CONSTRAINED EGONET TENSOR CPD

Given W, this section leverages the canonical polyadic de-composition (CPD) [40] in order to factorize the egonet tensorinto its constituent community-revealing factors. Assuming that

the number of communities is upperbounded by K, a rank-K CPD model is sought by solving the following constrainedleast-squares (LS) problem

{A, B, C} = arg minA ,B ,C

∥∥∥∥∥W −K∑

k=1

ak ◦ bk ◦ ck

∥∥∥∥∥2

F

s.t. A ≥ 0,B ≥ 0,C ≥ 0 (1)

where A := [a1 , . . . ,aK ] ∈ RN×K , B :=[b1 , . . . ,bK ]∈RN×K , and C := [c1 , . . . , cK ] ∈ RN×K ; while the term(ak ◦ bk ◦ ck ) is the outer product of the three vectors, whichinduces the k-th rank-one tensor component in the rank-Kdecomposition; see [40] for further details on CPD. The con-straint A ≥ 0 denotes entry-wise nonnegativity constraints, i.e.,ank ≥ 0 for n = 1, . . . , N and k = 1, . . . , K; and similarly forfactors B and C. These constraints enforce the nonnegativityof egonet adjacency matrices, thus inducing structure in thesought CPD and providing interpretation of the decompositionfactors.

It is possible to re-write (1) as, see e.g., [40]

{A, B, C} = arg minA ,B ,C

N∑n=1

‖W(n) −Adiag(cn )B‖2F

s.t. A ≥ 0,B ≥ 0,C ≥ 0

where diag(cn ) is a diagonal matrix holding the n-th row of Con its diagonal. Focusing on the n-th frontal slab of the egonet-tensor, CPD provides the approximation

W(n) =K∑

k=1

cnk (akbk ) (2)

where cnk denotes the (n, k)-th entry of factor C. Such de-composition can be interpreted as a weighted sum over K“basis”, {akbk }Kk=1 , where (akbk ) captures the “connectiv-ity structure” within the k-th community. Consequently, cnk

can be viewed as association level of node n to community kfor k = 1, . . . ,K. Furthermore, one can easily realize that since(akbk ) is viewed as connectivity structure of community k, theelements in ak := [a1k , . . . , ank ] and bk := [b1k , . . . , bnk ]can be consequently viewed as the contribution levels ofnodes n = 1, . . . , N to community k. This interpretation of the

Page 4: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5733

factors prompts us to further leverage the structure and introduceadditional constraints on the CPD factors.

A. Structured CPD

Real-world networks often involve nodes which are asso-ciated with more than one community, resulting in multiplenonzero entries in the association vector [cn1 , cn2 , . . . , cnK ]corresponding to a generic node n. To return a normalized as-sociation vector, we augment the optimization in (1) by theconstraints

∑Kk=1 cnk = 1 for n = 1, 2, . . . , N , which together

with C ≥ 0 acts as a simplex constraint on the rows of C. Uponimposing simplex constraints (1) is further regularized as

{A, B, C} = arg minA ,B ,C

{‖W −

K∑k=1

ak ◦ bk ◦ ck‖2F

+ λ(‖A‖2F + ‖B‖2F )

}

s.t. A ≥ 0,B ≥ 0,C ≥ 0

‖cn‖1 = 1 ∀n = 1, 2, . . . , N (3)

Different from [43] and [44], the regularization term ‖A‖2F +‖B‖2F does not play the role of rank-regularization for subspacelearning, instead it solves the scalar ambiguity in factors Aand B.

Note that the proposed algorithm requires an upperbound onthe number of communities K to solve for the optimization in(3). In principle, one can readily regularize the objective func-tion with a complexity term similar to that used by (Bayesian)Akaike’s information-theoretic, see e.g. [45], or, minimum de-scription length criteria [46]. At the expense of complexity, suchaugmented criteria can also return estimates of the number ofcommunities.

The CPD problem formulated in (3) is a tri-linear constrainedLS problem, whose minimization can be tackled by alternatingoptimization. In the ensuing subsection, the proposed solveris developed using by alternating optimization with ADMMintermediate steps; see e.g. [47] and [48].

B. Solving the Egonet Structured CPD

In the proposed alternating optimization scheme, each stepconsists of fixing two factors and minimizing the arising sub-problem with respect to the third factor. In this subsection, westudy the emerging sub-problems and propose efficient solversfor tackling those.

1) Factor A Update: Consider first the update of factor A atiteration k, obtained after fixing B = B(k−1) and C = C(k−1)

and solving the corresponding minimization. The arising sub-problem, after algebric manipulation can be re-written as

A(k) = arg minA≥0

‖W1 − (B(k−1) �C(k−1))A‖2F + λ‖A‖2F(4)

where W1 := [vec(W1,:,:), . . . , vec(WN,:,:)] ∈ RN 2×N is amatricized reshaping of the tensor W. Also B(k−1)�C(k−1)

:= [b(k−1)1 ⊗ c(k−1)

1 , . . . ,b(k−1)K ⊗ c(k−1)

K ] is the Khatri-Rao

product of B(k−1) and C(k−1) , where b(k−1)i (c(k−1)

i ) denotescolumn i of B(k−1) (resp. C(k−1)), and⊗ denotes the Kroneckerproduct operator; see also [40].

Following the steps in [48], auxiliary variable A is introducedto account for the nonnegativity constraint, and the augmentedLagrangian of (4) is

L(k)A (A, A, Y ) = ‖W1 −H(k)

A A‖2F + λTr{AA}+ r+(A) + (ρ/2)‖Y + A− A‖2F (5)

where A,Y ∈ RN×K , H(k)A := C(k−1) �B(k−1) , and r+(A)

is the regularizer corresponding to the nonnegativity constraint,

r+(A) :={

0 if A ≥ 0+∞ o.w.

The ADMM solver then proceeds by iteratively updatingblocks of variables A, A,Y as⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

A(r) = arg minA L(k)A (A, A(r−1) ,Y(r−1))

A(r) = P+(Y(r−1) + A(r))

Y(r) = Y(r−1) − ρ(A(r) − A(r))r = r + 1

(6)

until a convergence criterion is met, namely whether the max-imum number of iterations is exceeded, i.e., r = Imax,ADMM,or a prescribed ε-accuracy is met, i.e., ‖A(r) −A(r−1)‖F /‖A(r−1)‖F < ε. Operator P+(.) in (6) denotes the element-wise projection of the input matrix onto the positive orthant,and its use enables the A(r) update to be carried at a very lowcost. The Lagrange multiplier is set to ρ = ‖HA‖2F /K- a valuethat is empirically shown to yield similar performance to thatof the optimal value [48]. The final A(r) iterate in the ADMMsolver will be used to update A(k) .

2) Factor B update: Update of factor B can be similarlycarried out by solving the subproblem

B(k) = arg minB≥0

‖W2 −H(k)B B‖2F + λ‖B‖2F (7)

where W2 := [vec(W:,1,:), . . . , vec(W:,N ,:)], and H(k)B :=

C(k−1) �A(k) , yielding a similar optimization problem as in(4). Algorithm 2 tabulates the explicit update rules for solving(4) and similarly (7) using a general framework.

3) Factor C Update: Update of factor C is obtained byfixing A and B at their most recent values, and solving thesubproblem

C(k) = arg minC‖W3 − (A(k) �B(k))C‖2F

s.t., C ≥ 0 ‖cn‖1 = 1 ∀n = 1, . . . , N (8)

where W3 := [vec(W:,:,1), . . . , vec(W:,:,N )]. Utilizing anADMM approach, the augmented Lagrangian is formed as

L(k)C (C, C, Y ) = ‖W3 −H(k)

C C‖2F + rsimp(C)

+ (ρ/2)‖Y + C− C‖2F (9)

where C,Y ∈ RN×K , H(k)C := (A(k) �B(k)), and rsimp(C)

is the regularizer corresponding to the simplex constraint on therows of matrix C as

rsimp(C) :={

0 if C ≥ 0,∑K

k=1 cn,k = 1 ∀n+∞ o.w.

Page 5: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5734 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

Algorithm 1: Constrained Tensor Decomposition via Alter-nating Least-Squares (ALS).

Input W,K, Imax , λInitialize A,B,C ∈ RN×K at random and set k = 0Form Matrix reshapes W1 ,W2 ,W3 of the tensor asW1 := [vec(W1,:,:), . . . , vec(WN,:,:)]W2 := [vec(W:,1,:), . . . , vec(W:,N ,:)]W3 := [vec(W:,:,1), . . . , vec(W:,:,N )]while k < Imax do or not-convergenced

H(k)A = C(k−1) �B(k−1)

A(k) ← Algorithm 2 with input {H(k)A ,W1 ,A(k−1)}

H(k)B = C(k−1) �A(k)

B(k) ← Algorithm 2 with input {H(k)B ,W2 ,B(k−1)}

H(k)C = B(k) �A(k)

C(k) ← Algorithm 3 with input {H(k)C ,W3 ,C(k−1)}

k ← k + 1end whileRetrun A(k) ,B(k) ,C(k)

Algorithm 2: ADMM Solver for 1st and 2nd Mode Sub-problems.

Input H,W,ZinitGoal is to solve

Z∗ = arg minZ≥0

Tr{Z(HH + λIK×K )Z − 2WHZ

}

Set ρ = ‖Z init‖2FK ,Z(0) = Zinit, Z(0) = 0N×K ,Y(0) =

0N×K , r = 0while r < Imax,ADMM do

Z(r) = (HH + (λ + ρ/2)IK×K )−1

×(WH +

ρ

2(Y(r−1) − Z(r−1))

)

Z(r) = P+

(Z(r) + Y(r−1)

)Y(r) = Y(r−1) − ρ(Z(r) − Z(r))r = r + 1

end whileRetrun Z(r)

ADMM solver then proceeds with iterative updates as⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

C(r) = arg minC L(k)C (C, C(r−1) ,Y(r−1))

C(r) = Psimp(Y(r−1) + C(r))

Y(r) = Y(r−1) − ρ(C(r) − C(r))r = r + 1

(10)

wherePsimp(.) denotes the projection of rows of the input matrixonto the simplex set. This projection has been widely studiedand can be efficiently accommodated by the algorithm discussedin [49]. Explicit update steps of the ADMM solver for (8) aretabulated under Algorithm 3.

Proposition 1: If the sequence generated by Algorithm 4 isbounded, then the sequence {A(k) ,B(k) ,C(k)} converges to astationary point of (3).

Algorithm 3: ADMM Solver for 3rd Mode Subproblem.Input H,W,ZinitGoal is to solve

Z∗ = arg minZ≥0‖zn ‖1 =1 ∀n=1,...,N

Tr{ZHHZ − 2WHZ

}

Set ρ = ‖Z init‖2FK ,Z(0) = Zinit, Z(0) = 0N×K ,Y(0) =

0N×K , r = 0while r < Imax,ADMM do

Z(r) = (HH + ρ/2 IN×N )−1

×(WH +

ρ

2(Y(r−1) − Z(r−1))

)

Z(r) = Psimp

(Z(r) + Y(r−1)

)Y(r) = Y(r−1) − ρ(Z(r) − Z(r))r = r + 1

end whileRetrun Z(r)

Proof: The convergence follows from [48, Theorem 1]; alsoc.f. [50], [51]. �

Proposition 1 provides convergence guarantees on the se-quence {A(k) ,B(k) ,C(k)}, where the necessary assumption ison boundedness of the generated iterates. This assumption iseasily guaranteed here due to Frobenius regularization on fac-tors A and B, as well as the simplex constraint on factor C;thus, the results readily carry over.

IV. LOW-RANK EGONET TENSOR ANALYSIS

In this section, the intuitive low-rank property of the egonetadjacency matrix and tensor are justified analytically. To thisend, and similar to the analysis of spectral clustering methods incommunity detection, consider modeling a network of N nodesand two communities by a stochastic blockmodel (SBM) [52],[53]. That is, consider an undirected network with two com-munities of equal size N/2, assuming N is even for simplicity,where the intra-community edge probability is p and the inter-community edge probability is q, denoted as SBM(N ; 2; p; q),with p > q. Define also the expected egonet-adjacency of com-munity Ck as

Wk := E

[1|Ck |

∑vi ∈Ck

W(i)

], k = 1, 2

W(i) , as in Section II, denotes the egonet-adjacency matrixof node vi , and the expectation is taken with respect to therandomness of the graph.

Proposition 2: In an SBM(N ; 2; p; q) network with commu-nities of equal size and q < p , the expected egonet-adjacencymatrix W1 of community C1 can be approximated by a rank-2

matrix W1 , where the approximation error is bounded as

‖W1 − W1‖2 ≤ p3 +4N

p +√

2q.

Furthermore, for networks with large N and p >√

2q, se-lecting the N/2 largest-amplitude entries of the eigenvector

Page 6: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5735

corresponding to the largest eigenvalue unravels the membersof community C1 .

Proof: See Appendix A for detailed proof. �Building on this and upon proper permutation, similar result

can be provided for the expected egonet-adjacency matrix W2of community C2 . That is, W2 in an SBM(N ; 2; p; q) networkwith communities of equal size can be approximated by a rank-2

matrix W2 , and the approximation error is bounded as

‖W2 − W2‖2 ≤ p3 +4N

p +√

2q.

As the exact number of nodes in a community is unknown inpractice, hard community detection is performed by threshold-ing the association coefficients. This is common in non-negativematrix and tensor factorization approaches to community detec-tion [9], [14], [18], [24], [27], and has been empirically demon-strated to yield high-quality results. For this reason, it has beenadopted here too.

Definition: In an SBM(N ; 2; p; q) network, the expectedegonet-adjacency tensor W of size N ×N × 2 is defined asW(:,:,1) := W1 and W(:,:,2) := W2 .

Proposition 3: Tensor W in an SBM(N ; 2; p; q) networkwith equally-sized communities can be approximated by a rank-3 tensor, where the approximation error is upper-bounded as

minW ,rank(W )=3

1N‖W − W‖F

≤√

(p3 + 4p/N)2 + p2q4

N+

q(1− pq)N

(11)

which diminishes with the network size as O(1/√

N). Further-more, such approximation can be achieved by

WN×N×2 = βv0 ◦ v0 ◦[

11

]+ (α− β)v1 ◦ v1 ◦

[10

]

+ (α− β)v2 ◦ v2 ◦[

01

](12)

where v0 = 1N×1 , v1 = [11×N/2 ,01×N/2 ], and v2 =[01×N/2 ,11×N/2 ] and ◦ denotes the outer product.

Proof: See Appendix B for proof. �Remark: The first term in decomposition (12) captures the

inter-community edges, while the other two terms correspond tointra-community connections, and v1 and v2 reflect communitystructures. The following Corollary studies the asymptotic casewhere the probability of inter-community edges q approaches 0.

Corollary 2: The expected adjacency tensor of an SBM(N ; 2; p; q) network approaches a rank-2 tensor as q→0; that is

limq→0

1N

∥∥∥W − p3v1 ◦ v1 ◦[

10

]+ p3v2 ◦ v2 ◦

[01

]∥∥∥F

= 0

and the approximation error for sufficiency large N can beapproximated by

minW ,rank(W )=2

1N‖W − W‖F �

√p3

N.

Proof: The proof readily follows from the fact that the ap-proximation term corresponsing to v0 in (12) has Frobenius-norm

√2βN and its normalized (by N ) Frobenius-norm goes to

zero as q → 0. Furthermore, the right-hand-side of the inequal-ity in Prop. 3 for sufficiently large N and q→0 is dominated bythe first term, and thus it can be approximated by

√p3/N . �

So far, in Propositions 2 and 3 we have justified the intu-ition behind the rank-revealing decomposition of the expectedegonet-adjacency tensor W. In practice however, we never haveaccess to W1 and W2 . Instead, the egonet adjacency tensor inSection II provides N/2 samples of the egonet adjacency foreach of the two communities. Thus, one is motivated to approx-imate the expected egonet-tensor W with its sample mean.

Specifically, were the community membership of the verticesprovided, one could approximate W1 and W2 as

W1 � W1 :=2N

∑vi ∈ C1

W(i) , W2 � W2 :=2N

∑vi ∈ C2

W(i)

and define the N ×N × 2 sample mean egonet-adjacency ten-sor W with two slabs W(:,:,1) = W1 and W(:,:,2) = W2 .

Without community memberships available, we only haveaccess to the egonet adjacency W. The following propositionproves that the CPD of W (without the oracle knowledge) pro-vides an upperbound on the CPD of W.

Proposition 4: The rank-r CPD objective ofWN×N×2 is up-per bounded by the rank-r CPD objective of WN×N×N ; that is,

arg min{A ,B ,C}≥0

12

2∑k=1

‖Wk −Adiag(ck )B‖2F

≤ arg min{A ,B ,C}≥0

1N

N∑n=1

‖W(n) −Adiag(cn )B‖2F

where A, B, C := [c1 , . . . , cN ] are of size N × r, and C :=[c1 , c2 ] is of size 2× r.

Proof: For detailed proof, see Appendix C. �According to Proposition 4, rank-r CPD of W provides an

upper-bound on the rank-r CPD of the oracle-given W. Theanalysis can be generalized to networks with K communities,and further regularization and simplex constraints can help bet-ter utilize the intrinsic properties of the community detectiontask as well as resolve scalar ambiguities among CPD factors.The merits of our novel tensor decomposition with respect tothe conventional adjacency matrix factorization will be corrobo-rated by test results on synthetic as well as real-world networks.To this end, common metrics for quality evaluation of the de-tected communities are discussed next.

V. COMMUNITY ASSIGNMENT AND EVALUATION

Once the proposed solver returns the solution of (3), the rowsof factor C provide a “soft” or “fuzzy” community membershipfor the nodes in the network. In the special case of networkswith non-overlapping communities, using the n-th row of matrixC, node n will be assigned to the community k∗ where k∗ =arg maxk=1,...,K cnk .

In order to provide a “crisp” community association in net-works with overlapping communities, where a node can be as-sociated with more than one community, the entries cnk arecompared with a threshold τ and node n is associated withcommunity k if cnk > τ . Thus, crisp community membership

Page 7: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5736 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

matrix Γ ∈ RN×K is obtained as

[Γ]nk :={

1 if cnk > τ

0 o.w.∀n, k (13)

There are a number of metrics available for evaluating thequality of a detected cover, that is, a set of communities, innetworks with underlying community structure. Depending onwhether the ground-truth communities are available or not, twocategories of metrics are considered.

A. Networks With Ground-Truth Communities

Normalized mutual information and F1-score are the mostcommonly-used metrics for performance evaluation over net-works with ground truth communities. Let cover S := {C1 , . . . ,C|S|} denote the set of detected communities, where Ci is the

set of nodes associated with community i for i = 1, 2, . . . , |S|,and let the ground truth communities be denoted by S∗ :={C∗1 , . . . , C∗|S∗|}.

Normalized mutual information (NMI) [20]: NMI is aninformation-theoretic metric defined as (cf. [20])

NMI(S∗, S) :=2I(S∗, S)

H(S∗) + H(S)

where H(S) denotes the entropy of set S defined as

H(S) := −|S|∑i=1

p(Ci) log p(Ci) = −|S|∑i=1

|Ci |N

log|Ci |N

and similarly for H(S∗). Furthermore, I(S∗, S) denotes the mu-tual information between the detected and ground-truth com-munities, and is defined as

I(S∗, S) :=|S∗|∑i=1

|S|∑j=1

p(C∗i ∩ C∗j ) logp(C∗i ∩ Cj )p(C∗i )p(Cj )

(14)

=|S∗|∑i=1

|S|∑j=1

|C∗i ∩ Cj |N

logN |C∗i ∩ Cj ||C∗i ||Cj |

(15)

Intuitively, mutual information I(S∗, S) reflects a measure ofsimilarity between the two community sets, while entropy H(S)(H(S∗)) denotes the level of uncertainty in community affilia-tion of a random node in cover S (resp. S∗). Thus, high valuesof NMI, namely its maximum at 1, reflect predictability of Sbased on S∗ which readily translates into correct communityidentification in the detected cover S, whereas low values ofNMI, namely its minimum at 0, reflects poor discovery of thetrue underlying communities. This measure has been general-ized for overlapping communities in [54], and will be utilizedfor performance assessment in such networks.

Average F1-score [9]: F1-score is a measure of binary classi-fication accuracy, specifically, the harmonic mean of precisionand recall, taking its highest value at 1 and lowest value at 0.To obtain the average F1-score for S, one needs to find whichdetected community Ci ∈ S corresponds to a given true com-munity C∗j ∈ S∗, i.e., maximizes the corresponding F1-score.

The average F1-score is then given by

F1 :=1|S∗|

|S∗|∑i=1

F1(C∗i , CI (i))

where

I(i) = arg maxj

F1(C∗i , Cj )

in which F1(C∗i , Cj ) := 2 |C∗i ∩Cj ||C∗i |+ |Cj |

.

B. Networks without ground-truth communities

A general metric for evaluating the “quality” of detected com-munities, regardless of whether the ground-truth membershipsare available or not, is to measure conductance [55].

Conductance: Conductance of a detected community Ck ingraph G is defined as

φ(Ck ) :=

∑i ∈ Ck ,j /∈Ck Wij

min{vol(Ck ), vol(V \ Ck )}where

vol(Ck ) :=∑

i ∈ Ck ,∀jWij

and (V \ Ck ) is the complement of Ck . According to this mea-sure, high-quality communities yield small conductance scoresas they exhibit dense connections among the nodes within thecommunity and sparse connections with the rest.

Furthermore, the weighted-average φ(S) is defined as theaverage conductance of the detected communities weighted bytheir (normalized) community size, that is,

φ(S) :=|S|∑

k=1

|Ck |N

φ(Ck ). (16)

VI. COMMUNITY DETECTION ON TIME-VARYING GRAPHS

In this section, we extend the introduced overlapping com-munity identification approach over networks for which theconnectivity evolves over time [56]. For instance, considerthe emergence of a new sports club giving rise to a newcommunity of individuals whose newly-formed interactionsin the club will be reflected in their connections over the so-cial media, or, the network of brain regions where the activa-tion/deactivation of different regions during a certain task can becaptured by a time-varying graph. The goal is to utilize the pro-posed EgoTen approach for identification of dynamic commu-nities, as well as the corresponding time-varying association ofnodes.

To this end, consider graph Gt := (V, Et ,Wt), where thesubscript t denotes time index t = 1, . . . , T . The set of nodes Vis assumed fixed across time, while the edgeset Et as well as thecorresponding adjacency matrix Wt are allowed to vary. Theintroduced EgoTen community identification algorithm can bereadily applied to time-varying graphs as follows.

For any slot t, the procedure of egonet tensor construc-tion can be carried out, giving rise to a 3-dimensional egonet-tensor denoted by Wt . Subsequently, the overall 4-dimensionalegonet-tensor is constructed by stacking Wt ∀t along the fourth

Page 8: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5737

dimension of W ∈ RN×N×N×T ; that is, according to the ten-sor parlance we have W:,:,:,t = Wt for t = 1, . . . , T . Havingformed the overall egonet-tensor W, dynamic community de-tection is now cast as

{A, B, C, D} = arg minA ,B ,C ,D

{‖W −

K∑k=1

ak ◦ bk ◦ ck ◦ dk‖2F

+ λ(‖A‖2F + ‖B‖2F )

}

s.t. A ≥ 0,B ≥ 0,C ≥ 0,D ≥ 0

‖cn‖1 = 1 ∀n = 1, 2, . . . , N

‖dn‖1 = 1 ∀n = 1, 2, . . . , N (17)

where A,B,C ∈ RN×K , and D := [d1 , . . . , dT ] ∈ RT ×K .The LS cost in (17) is the generalization of that for the 3-dimensional tensor decomposition to a higher dimension, whilenonnegativity and simplex constrains are similarly carried overfor D. To clarify the simplex constraints on the rows of D,consider the decomposition

Wt =K∑

k=1

dtk

(ak ◦ bk ◦ ck

). (18)

For a fixed slot t, the rank-one tensors of (ak ◦ bk ◦ ck ) canbe perceived as building blocks of the 3-dimensional egonet-tensors, and the “degree of presence” of community k at time tis captured by dkt , while the constraint ‖dt‖1 = 1 ∀t resolvesthe scalar ambiguity by normalization. Indeed, stationary graphswith T = 1 can be subsumed by this model, for which theadditional constraints on D reduce to the trivial solution forthe fourth factor as D = 11×K . Focusing on the n-th slab ofWt , the egonet adjacency of node n at time t is decomposed as

W(n)t =

K∑k=1

dtk cnk

(ak ◦ bk

)(19)

where the product dtk cnk provides the association of node n tocommunity k at time t.

Similar to (3), the decomposition in (17) is solved by alter-nating least-squares for updating the factors. The overall solveris provided in Algorithm 4, within which we have utilized Algo-rithm 2 and Algorithm 3 for handling the emerging subproblems.

Note the proposed decomposition is in fact unraveling theevolution of detected communities across time via columns offactor D. That is, rather than explicit discovery of temporalcommunity association of node n, the k-th column of factor Dis utilized to modulate the association of node n to communityk across time, with dtk ∗ cnk for time t. Thus, association of dif-ferent nodes will be modulated in the same way at time t, whilethe final product is influenced by both dtk and cnk . Furthermore,one could jointly utilize factors A, B, and C for discoveringnode-community association, where similar constraints such assimplex should be imposed on (at least one of) the other twofactors. For faster convergence however, we have not imposedsuch additional structure, and have solely utilized factor C forthis purpose. Thus, joint utilization of the three factors remainsan open direction.

Algorithm 4: Constrained ALS for Time-Varying Graphs.Input W,K, Imax , λInitialize A,B,C ∈ RN×K and D ∈ RT ×K and set k = 0Form matrix reshapes W1 ,W2 ,W3 ,W4 of the tensor asW1 := [vec(W1,:,:,:), . . . , vec(WN,:,:,:)]W2 := [vec(W:,1,:,:), . . . , vec(W:,N ,:,:)]W3 := [vec(W:,:,1,:), . . . , vec(W:,:,N ,:)]W4 := [vec(W:,:,:,1), . . . , vec(W:,:,:,T )]while k < Imax do or not-convergenced

H(k)A = D(k−1) �C(k−1) �B(k−1)

A(k) ← Algorithm 2 with input {H(k)A ,W1 ,A(k−1)}

H(k)B = D(k−1) �C(k−1) �A(k)

B(k) ← Algorithm 2 with input {H(k)B ,W2 ,B(k−1)}

H(k)C = D(k−1) �B(k) �A(k)

C(k) ← Algorithm 3 with input {H(k)C ,W3 ,C(k−1)}

H(k)D = C(k) �B(k) �A(k)

D(k) ← Algorithm 3 with input {H(k)D ,W4 ,D(k−1)}

k ← k + 1end whileRetrun A(k) ,B(k) ,C(k) ,D(k)

VII. SIMULATED TESTS

In this section, the performance of the proposed EgoTencommunity detection algorithm is assessed via benchmark syn-thetic networks, as well as real-world datasets. Experiments overLancicchinetti-Fortunatoand-Radicci (LFR) synthetic bench-marks enables us to simulate networks with different levelsof community mixing, as well as number of overlapping nodes.This proves helpful in highlighting the enhanced capacity ofcommunity detection achieved by exploiting the “higher-order”properties of vertices captured in the proposed egonet-tensor.

A. Benchmark Networks

LFR benchmark networks provide synthetic graphs withground-truth communities, in which certain properties of real-world networks, namely power-law distribution for nodal de-grees as well as community sizes are preserved. LFR networksare configured to have a total number of N nodes (vertices),nodal average degree d, exponent of degree distribution γ1 , andexponent of community-size distribution γ2 . Furthermore, mix-ing parameter μ controls the community cohesion, where largerμ induces more connections among nodes in different commu-nities, thus generating less cohesive communities. Moreover,parameters {on , om} set the number of overlapping nodes, thatis, nodes belonging to more than one community, and the num-ber of communities with which these nodes are associated, re-spectively. We have compared the performance of the proposedEgoTen with (similarly-constrained) nonnegative matrix factor-ization (NMF) schemes over the adjacency matrix, as well asother state-of-the-art community detection schemes.

1) Matrix Versus Tensor Factorization: The experimentshere focus on comparison between EgoTen and its matrix coun-terpart. In particular, we will demonstrate how the higher-order connectivity patterns as well as structured redundancyoffered through EgoTen can increase the robustness of com-munity detection in the case of overlapping, and highly-mixedcommunities.

Page 9: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5738 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

Fig. 4. Performance of constrained NMF and EgoTen in terms of (a) NMI;and (b) average F1-score, versus μ for LFR networks of N = 1,000, withon = 200.

To this end, the corresponding NMF approach aims at factor-izing matrix W by solving

minU ,V‖W −UV‖2F

s.t. ‖un‖1 = 1∀n = 1, . . . , N, U ≥ 0 ,V ≥ 0 (20)

where U,V ∈ RN×K , and U := [u1 ,u2 , . . . ,uN ]. Thus,similar to the nonnegative tensor decomposition in (3), the n-throw of matrix U contains community association coefficients ofnode n, and is subject to a simplex constraint. This minimizationis solved via the AO-ADMM toolbox in [48]. Hard communityassignments resulting from the factor U can be achieved similarto the procedure over the C factor in EgoTen, as discussed inSection IV.

In our experiments, we have generated LFR networks withN = 1,000, γ1 = 2, γ2 = 1, average degree d = 100, and om ={2, 3, 5}. To demonstrate the robustness of EgoTen, Fig. 4 showsthe performance of EgoTen in comparison with constrainedNMF over the adjacency matrix versus different values of μ,in terms of the NMI, and the F1-score. The experiments havebeen carried out for on = 200 (20%) number of overlappingnodes, upperbound on the number of communities is set asK = 3|C|, and τ is chosen so that the highest NMI is achievedfor both methods. Furthermore, Fig. 5 depicts the performanceof NMF and EgoTen across different levels of overlap, con-trolled through the number of overlapping nodes on , while fixingμ = 0.2. In addition, since a common practical concern is thelack of knowledge over the number of communities, Fig. 6 stud-ies the robustness of NMF and EgoTen to this factor, by settingK = κ|C|, and varying κ ∈ [1.2, 5]. As the plots in Figs. 4–6corroborate, capturing the higher-order connectivity patterns

Fig. 5. Performance of constrained NMF and EgoTen in terms of (a) NMI;and (b) average F1-score, versus on for LFR networks with N = 1,000, andμ = 0.2.

Fig. 6. Performance of constrained NMF and EgoTen in terms of (a) NMI;and, (b) average F1-score, versus upperbound on community number K = κ|C|parametrized by κ for LFR networks with N = 1, 000, on = 300, and μ = 0.2.

Page 10: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5739

Fig. 7. Performance of EgoTen and state-of-the-art algorithms in terms of(a) NMI; and (b) average F-1 score, for LFR networks of N = 1, 000,on = 300, and om = 5 versus μ.

of the vertices and the structured decomposition of reinforcedegonet-tensor improve robustness of nonnegative factorizationmethods against community coherence, presence of overlappingnodes, as well as rough estimates of the number of communities.

2) EgoTen vs. State-of-the-Art Methods: In this subsection,we compare the performance of EgoTen with state-of-the-artcompetitors, namely Oslom [57], Bigclam [9], Infomap [58],and Louvain [5], and MMSB [10]. Similar to the previous sub-section, we have generated LFR networks with N = 1, 000,γ1 = 2, γ2 = 1 and om = {2, 3, 5}. Number of overlappingnodes on as well as mixing parameter μ are varied in the range[10, 600] and [0.1, 0.7], respectively. Unless otherwise stated,threshold parameter for assigning hard community member-ships of EgoTen is chosen as1 τ = 1/K.

Performance is reported in terms of NMI, and F1-score. AsFigs. 7 and 8 corroborate, EgoTen offers the highest robustnessfor a wide range of μ, as well as on , thanks to the reinforcedstructure of the egonet-tensor.

Furthermore, to investigate the influence of parameters K(and τ ) on the performance of Bigclam, MMSB, and Egoten,Fig. 9 depicts NMI and F1-score for three choices of threshold-ing while κ in K = κ|C| is varied in [1, 3]. The three choices ofthresholding are as follows: (a) τ = 1/K, (b) τcond is selectedby exhaustive search over [0.01, 0.5] with a discretization of0.01, and the value corresponding to the minimum conductance

1Since we have imposed a simplex constraint over the K association indicesfor any given node, τ = 1/K could be interpreted as having an association indexhigher than an equal association with all detected communities. Other selectionschemes for τ , for instance setting to the value providing the community coverwith the smallest average conductance, are also viable.

Fig. 8. Performance of EgoTen and state-of-the-art algorithms in terms of(a) NMI; and (b) average F1-score, for LFR networks of N = 1, 000, μ = 0.2,and om = 5 versus on .

Fig. 9. Performance of EgoTen, Bigclam, and MMSB vs κ in K = κ|C|, interms of (a) NMI; and (b) average F1-score, in LFR network of N = 1,000,μ = 0.2, om = 5, and on = 200.

Page 11: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5740 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

Fig. 10. Variance of NMI via EgoTen versus μ and κ averaged in LFR net-works of N = 1, 000, om = 2, and on = 200.

Fig. 11. Scalability of different algorithms as N grows.

Fig. 12. NMI for the detected communities across time for synthetic time-varying graphs.

is finally selected, and (c) τoracle corresponds to the selectionof τ giving the highest NMI. The last choice is assumed to begiven by an oracle who knows the ground truth, and is providedto benchmark performance. As the plot suggests, while MMSB-based approaches provide interesting modeling of the problemand have attracted a lot of attention in the past years, their per-formance is highly influenced by how good of an estimate onehas on the number of communities K. Thus, MMSB-modelingprovides high accuracy when K � |C|, while its performancedegrades dramatically as the mismatch increases. In contrast,EgoTen, followed by Bigclam, provide higher level of robust-ness, which we attribute to the enhanced structure of the egonettensor inherent to its construction. To illustrate the effect ofinitialization, variance of the algorithm in terms of NMI is pro-vided by the error-bar plot in Fig. 10, versus different values ofμ and κ.

Regarding the scalability of the proposed EgoTen algorithm,Fig. 11 depicts run-time versus network size N while average

Fig. 13. Time-varying community association of nodes with transition timesdrawn from N (10, 1). Communities are color-coded; thus, for a given t on thex-axis, nodes in a community have the same color in (a) ground truth; and resultsvia (b) EgoTen; (c) NMF; and (d) 3D-tensor decomposition.

Fig. 14. Visualization of the American College Football Network withN = 115 and K = 12. Different colors correspond to different detected com-munities, and the pie-charts reveal soft community association of the nodes.

nodal degree is d = 20. The experiment was run on an Intel (R)Core (TM) i7 4.00 GHz CPU with 8 cores and 32 GB of RAM.EgoTen, Bigclam, Louvain, and MMSB take effective use ofparallelization in their implementation. As the plot corroborates,the introduced redundancy in EgoTen can be fairly alleviated viaexploitation of sparsity as well as parallelization, thus endowingit with scalability, while the same cannot be claimed for all othercompetitors.

Page 12: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5741

Fig. 15. Maximum-conductance versus coverage for real-world networks with underlying community structure. Lower curves correspond to better performance.

B. Time-Varying Graphs

In this subsection, the performance of the proposed EgoTenin Algorithm 4 for community identification over time-varyingnetworks is assessed. To this end, we have generated a syn-thetic network with N = 1, 000 nodes for a span of T = 20slots. Initially at t = 1, the networks is generated with two dis-tinct communities, each containing of 500 nodes. For t > 1,the community association of 600 randomly selected nodes re-mains unchanged, whereas the other 400 nodes migrate fromtheir original community to a third newly-formed community.Transition slot τn for each of these nodes is identically drawnfrom a normal distribution N (10, 2). For any time slot t, thenetwork edges are drawn according to a block stochastic model,where nodes within the same community are connected withprobability 0.3, and out-of-community edges are drawn withprobability 0.1. EgoTen’s performance is compared with that ofconstrained NMF, for which U and V per t is used as initial-ization for t + 1 to provide NMF with consistency across time.We also provide the results of NMF when no initialization isutilized. Furthermore, performance is compared with the resultof community detection via rank-K tensor decomposition of a3D-tensor; the t-th slab of the tensor is the adjacency matrix ofthe network at time t. In addition to the nonegativity constraints,first and second factors are regularized with Frobenius norm toresolve scalar ambiguity, and the rows of the third term is sub-ject to simplex constraints. This comparison is to highlight theadvantage of the proposed enhanced network representation viathe tensor of egonets.

Performance is measured in terms of NMI, and it is averagedover 100 realizations of the network in Fig. 12. Furthermore,Fig. 13 illustrates the identified communities for different nodesacross time for a realization. That is, for any t on the x-axis,nodes associated with the same community are shown with the

TABLE IREAL-WORLD NETWORKS

same color. The plot depicts ground truth as well as 3D-tensor,EgoTen and NMF results, where we have perturbed ordering ofnodes for a better visualization. Clearly, EgoTen successfullyidentifies the two initial communities, as well as three commu-nities after the migration of a subset of nodes, presenting solidblocks similar to those in the ground truth, while communitiesdetected via constrained NMF and 3D-tensor decomposition areof lower quality.

C. Real World Networks

In this section EgoTen is utilized for performing commu-nity detection on a number of real-world networks, tabulated inTable I. To study the quality of detected communities for real-world networks -whose ground truth community association isoften unavailable- we examine the conductance of detected com-munities. In particular, given a cover C = {C1 , ..., CK }, let uscompute the conductance φ(Ci) for i = 1, . . . ,K. Correspond-ing to a value ν ∈ [0 1], let us define the set of communitieswhose conductance is less than ν, i.e., Sν := {Cj |φ(Cj ) < ν}.Then, coverage(ν) is defined as

coverage(ν) :=

∣∣∣∪Ci ∈ SνCi∣∣∣

N(21)

Page 13: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5742 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

TABLE IIQUALITY OF DETECTED COVER ON REAL-WORLD NETWORKS IN TERMS OF AUC AND AVERAGE CONDUCTANCE

Consequently, conductance-coverage curve is plotted by vary-ing the value ν from 0 to 1 on the y-axis and reporting thecorresponding coverage value on the x-axis. As low values ofconductance correspond to more cohesive communities, smallerarea under curve (AUC) implies better performance. Fig. 15plots the coverage-conductance curve and Table II tabulatesAUC as well as average conductance defined in (16). SinceLouvain does not allow for overlapping nodes, the two met-rics coincide, hence the corresponding one column in Tabel II.As the results corroborate, the quality of detected communi-ties via EgoTen is closely competing or outperforming the onesprovided by other methods, while it remains robust to “resolu-tion limit” [59] observed in Oslom in Fig. 15(a) and Infomap inFig. 15(d) whose performance are limited to detecting only largecommunities.

VIII. CONCLUSION AND REMARKS

By viewing networks as the union of nodal egonets, a noveltensor-based representation for capturing high-order nodal con-nectivities has been introduced. The induced redundancy in theconstructed egonet-tensor bestows the novel representation withrich structure, and is utilized for community detection by castingthe problem as a constrained tensor decomposition task. Utiliza-tion of tensor sparsity as well as parallel computation endow thealgorithm with scalability, while the structured redundancy en-hances the performance against overlapping and highly-mixedcommunities. The proposed framework is broadened to accom-modate time-varying graphs, where a four-dimensional tensorenables simultaneous community identification over the entirehorizon yielding an improved performance.

As a natural extension, one can generalize the tensor-basedrepresentation to account for adjacency matrices capturing theconnectivity of 2, 3, . . . , dmax -hop neighbors. This approach in-deed highlights the tradeoff between flexibility and redundancy,as memory and computational intensity of the correspondingCPD as well as proper tuning of parameters will influence thequality of the detected communities. One could analyze thistradeoff to further characterize how the quality of detected com-munities evolves as the coverage of extended-egonets increases;however, this goes beyond the scope of this work, and is left forfuture investigation.

APPENDIX A

Before proving Proposition 2, we will establish the followingLemma.

Lemma: For the N ×N matrix

M =

[α1N/2×N/2 , β1N/2×N/2

β1N/2×N/2 , β1N/2×N/2

](22)

with 0 < β < α, it holds that rank(M) = 2. Furthermore, forthe two eigenvectors λ1(M) > λ2(M) > 0, we have:

i) λ1(M) > Nα/2;ii) λ2(M) < Nβ/2;iii) the eigen-gap is at least λ1(M)−λ2(M)>N(α−β)/2.Proof of Lemma: The claim on the rank is due to the decom-

position

M = (α− β)[11×N/2 ,01×N/2 ][11×N/2 ,01×N/2 ]

+ β11×N 11×N .

Furthermore, due to the symmetry of M, the two eigenvectorstake the form

v = [γ11×N/2 , η11×N/2 ].

Solving Mv = λv for λ yields

λ2 − N

2(β + α)λ +

N 2

4β(α− β) = 0

from which the two eigenvalues are

λ1,2(M) =N

4(α + β)± N

4

√(α + β)2 − 4β(α− β) > 0.

Thus,

λ1(M) =N

4(α + β) +

N

4

√(α− β)2 + 4β2 ≥ N

2α,

λ2(M) =N

4(α + β)− N

4

√(α− β)2 + 4β2 ≤ N

where we have used the inequality√

(α− β)2 + 4β2 ≥ α− βfor α > β. Clearly, their difference satisfies

λ1(M)− λ2(M) >N

2(α− β).

Furthermore, the eigenvector v1 corresponding to the top eigen-value λ1 is

v1 = [γ11N/2×1 , η11N/2×1 ]

where using Mv1 = λ1v1 , λ1 ≥ Nα/2, and given α > 2β

η1

γ1=

2λ1 − βN<

β

α− β< 1.

Thus selecting the N/2 largest-amplitude entries of v1 unravelsthe member nodes in community C1 . �

Proof of Proposition 2: To start, consider the structure of theegonet adjacency of nodes in community C1 . Without loss ofgenerality, suppose that the vertices are indexed such that C1 ={v1 , v2 , . . . , vN/2}, C2 = {vN/2+1 , . . . , vN }, and partition theadjacency W1 into four submatrices each of size N/2×N/2, as

W1 =[Ω11 ,Ω12Ω21 ,Ω22

]. (23)

Page 14: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5743

Next, focus on the block Ω11 that captures the edges amongthe within-community neighbors of ‘ego node’ vn ∈ C1 , i.e.,central nodes in an egonet in expectation. For 1 ≤ i, j ≤ N/2and i �= j we have

W1(i, j) = E

⎡⎣ 1N/2

N/2∑n=1

1(w (n)i , j =1)

⎤⎦ =

1N/2

N/2∑n=1

Pr{w(n)i,j = 1}

=1

N/2

(N/2∑

n=1,n �=i,j

Pr{wi,n = 1, wj,n = 1, wi,j = 1}

+ Pr{wi,j = 1}+ Pr{wi,j = 1})

= p3 +4N

p− 4N

p3

where we have split the summation over three parts (i) n =1, . . . , N/2, n �= i, j, (ii) n = i, and (iii) n = j; and have alsoused the fact that since nodes i, j, n ≤ N/2 are in commu-nity C1 , their pair-wise edge probability is equal to p, andindependent. However, the diagonal entries of Ω11 are zero,due to lack of self loops in the graph. Thus, with α :=p3 + 4p/N − 4p3/N , the Ω11 submatrix can be briefly rewrit-ten as Ω11 = α1N/2×11N/2×1 − αIN/2 .

Similarly, the submatrix Ω22 capturing the W1(i, j) entriesfor i, j > N/2 and i �= j yields

W1(i, j) = E

⎡⎣ 1N/2

N/2∑n=1

1(w (n)i , j =1)

⎤⎦ =

1N/2

N/2∑n=1

Pr{w(n)i,j = 1}

=2N

N/2∑n=1

Pr{wi,n = 1, wj,n = 1, wi,j = 1} = pq2

where we have used i, j ∈ C2 and n ∈ C1 . Furthermore, thediagonal entries of Ω22 are zero due to the absence of selfloops. Thus, with β := pq2 , Ω22 can be rewritten as Ω22 =β1N/2×11N/2×1 − βIN/2 .

Finally, regarding entries of Ω12 , the (i, j) entry of W1 for1 ≤ i ≤ N/2 and N/2 < j is by definition

W1(i, j) = E

⎡⎣ 1N/2

N/2∑n=1

1(w (n)i , j =1)

⎤⎦ =

1N/2

N/2∑n=1

Pr{w(n)i,j = 1}

=2N

(N/2∑

n=1,n �=i

Pr{wi,n = 1, wj,n = 1, wi,j = 1}

+ Pr{wi,j = 1})

= pq2 +2N

q − 2N

pq2

where we have used that i, n ∈ C1 and j ∈ C2 , as well as theedge independence. Similar argument holds for the submatrixΩ21 due to symmetry, which can be rewritten as

Ω21 = Ω12 := (β + Δβ)1N/2×11N/2×1

with Δβ := 2N q − 2

N pq2 .

Putting together the unraveled structure of matrix W1 , weobtain W1 = M + ΔM + Γ, where

ΔM :=

[0 Δβ1N/2×11N/2×1

Δβ1N/2×11N/2×1 0

]

Γ :=[−αIN/2 0

0 −βIN/2

].

It can be easily shown that ‖Γ‖2 = α due to β < α, and‖ΔM‖2 ≤ ‖ΔM‖F ≤

√2(q − pq2). Thus, using the Cauchy-

Schwarz inequality, while substituting α and omitting the neg-ative terms, we arrive at

‖W1 −M‖2 ≤ ‖ΔM‖2 + ‖Γ‖2 ≤ p3 +4N

p +√

2q.

Furthermore, following the proof of Lemma 1, in networkswith large N and given p >

√2q, the N/2 largest-amplitude

entries of the eigenvector corresponding to the largest eigenvaluereveal the members of community C1 . �

Intuitively, the 2-norm of matrices ΔM and D are to becompared with the largest eigenvalue of λ1(M) > αN/2. Thesetwo terms can be interpreted as small perturbations on the mainM matrix. Hence, for sufficiently large network, the eigenvaluesand vector pairs are slightly perturbed due to these terms, butthe overall structure of community C1 can be inferred from theeigenvector corresponding to the largest eigenvalue.

APPENDIX B

Proof of Proposition 3: Under arguments similar to those asin the proof of Prop. 2, it can be shown that the expected egonet-adjacency for community C2 is

W2 =

[Ω′11 ,Ω

′1,2

Ω′21 ,Ω′22

]

where Ω′11 = β1N/2×11N/2×1 − βIN/2 , Ω′12 = Ω′21 := (β +Δβ)1N/2×11N/2×1 , Ω′22 = α1N/2×11N/2×1 − αIN/2 , and αand β are defined in Appendix A. Thus using the Lemma withappropriate permutation matrix, W2 can be effectively approx-imated by a rank-2 matrix

M′ = βv0 ◦ v0 + (α− β)v2 ◦ v2

while matrix W1 can be approximated by a rank-2 matrix

M = βv0 ◦ v0 + (α− β)v1 ◦ v1

Building on the common term in M and M′, and utilizing arank-3 CPD triplet of factors

A = B = [√

βv0 ,√

α− βv1 ,√

α− βv2 ],C =[

1, 1, 01, 0, 1

],

Page 15: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

5744 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 21, NOVEMBER 1, 2018

the Forbenius cost of rank-3 approximation of tensor W can bebounded as

minW ,rank(W )=3

1N‖W − W‖F

≤ 1N

(‖W1 −Adiag(1, 1, 0)B‖2F

+ ‖W2 −Adiag(1, 0, 1)B‖2F)1/2

≤ 1N

√‖W1 −M‖2F + ‖W2 −M′‖2F

≤√

2N‖W1 −M‖F

≤√

2N

(‖Γ‖F + ‖ΔM‖F

)

≤√

2N

(√N

2(α2 + β2) +

√2N

2Δβ

)

≤√

(p3 + 4p/N)2 + p2q4√

N+

q(1− pq)N

. �

APPENDIX C

Proof of Proposition 4: One can readily rewrite the CPD ob-jective corresponding to W as

1N

N∑n=1

‖W(n) −Adiag(cn )B‖2F

=12

(1

(N/2)

∑vn ∈ C1

‖W(n) −Adiag(cn )B‖2F

+1

(N/2)

∑vn ∈ C2

‖W(n) −Adiag(cn )B‖2F)

≥ 12

(∥∥∥∥∥1

(N/2)

∑vi ∈C1

W(i) −A1

(N/2)

∑vn ∈C1

(diag(cn ))B∥∥∥∥∥

2

F

+

∥∥∥∥∥1

(N/2)

∑vi ∈C2

W(i) −A1

(N/2)

∑vn ∈C2

(diag(cn ))B∥∥∥∥∥

2

F

)

=12

(‖W1 −Adiag(c1)B‖2F + ‖W2 −Adiag(c2)B‖2F

)

≥ arg min{A ,B ,c1 ,c2 }≥0

12

2∑k=1

‖Wk −Adiag(ck )B‖2F (24)

where the first inequality follows from Jenson’s inequality overthe (convex) Frobenius norm square, and in the last line, c1 =

1(N/2)

∑vn ∈ C1 diag(cn ) and c2 = 1

(N/2)

∑vn ∈ C2 diag(cn ).

Minimizing the left hand side with respect to {A,B,C} ≥ 0completes the proof. �

REFERENCES

[1] J. D. Power et al., “Functional network organization of the human brain,”Neuron, vol. 72, no. 4, pp. 665–678, Nov. 2011.

[2] Y. R. Lin, Y. Chi, S. Zhu, H. Sundaram, and B. L. Tseng, “Analyzing com-munities and their evolutions in dynamic social networks,” ACM Trans.Knowledge Discovery Data, vol. 3, no. 2, pp. 8:1–8:31, Apr. 2009.

[3] S. Papadopoulos, Y. Kompatsiaris, A. Vakali, and P. Spyridonos, “Com-munity detection in social media,” Data Mining Knowledge Discovery,vol. 24, no. 3, pp. 515–554, 2012.

[4] P. K. Reddy, M. Kitsuregawa, P. Sreekanth, and S. S. Rao, “A graph-basedapproach to extract a neighborhood customer community for collaborativefiltering,” in Proc. Int. Workshop Databases Netw. Inf. Syst., 2002, pp. 188–200.

[5] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fastunfolding of communities in large networks,” J. Statist. Mech., TheoryExp., vol. 2008, no. 10, 2008, Art. no. P10 008.

[6] J. Duch and A. Arenas, “Community detection in complex networks usingextremal optimization,” Phys. Rev. E, vol. 72, no. 2, 2005, Art. no. 027 104.

[7] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing, “Mixed mem-bership stochastic blockmodels,” in Proc. Adv. Neural Inf. Process. Syst.,Vancouver, Canada, Dec. 2009, pp. 33–40.

[8] A. Anandkumar, R. Ge, D. Hsu, and S. M. Kakade, “A tensor approachto learning mixed membership community models,” J. Mach. Learn. Res.,vol. 15, no. 1, pp. 2239–2312, Jan. 2014.

[9] J. Yang and J. Leskovec, “Overlapping community detection at scale: Anonnegative matrix factorization approach,” in Proc. ACM Int. Conf. WebSearch Data Mining, Rome, Italy, Feb. 2013, pp. 587–596.

[10] X. Mao, P. Sarkar, and D. Chakrabarti, “On mixed memberships andsymmetric nonnegative matrix factorizations,” in Proc. Int. Conf. Mach.Learn., Sydney, Australia, 2017, pp. 2324–2333.

[11] P. K. Gopalan, S. Gerrish, M. Freedman, D. M. Blei, and D. M. Mimno,“Scalable inference of overlapping communities,” in Proc. Adv. NeuralInf. Process. Syst., 2012, pp. 2249–2257.

[12] I. El-Helw, R. Hofman, W. Li, S. Ahn, M. Welling, and H. Bal, “Scalableoverlapping community detection,” in Proc. IEEE Int. Parallel Distrib.Process. Symp. Workshops., Chicago, IL, USA, 2016, pp. 1463–1472.

[13] I. Derenyi, G. Palla, and T. Vicsek, “Clique percolation in random net-works,” Phys. Rev. Lett., vol. 94, no. 16, Apr. 2005, Art. no. 160202.

[14] U. Von Luxburg, “A tutorial on spectral clustering,” Statist. Comput.,vol. 17, no. 4, pp. 395–416, Aug. 2007.

[15] F. Wang, T. Li, X. Wang, S. Zhu, and C. Ding, “Community discoveryusing nonnegative matrix factorization,” ACM Trans. Data Mining Knowl-edge Discovery, vol. 22, no. 3, pp. 493–521, Jul. 2011.

[16] X. Cao, X. Wang, D. Jin, Y. Cao, and D. He, “Identifying overlappingcommunities as well as hubs and outliers via nonnegative matrix factor-ization,” Sci. Rep., vol. 3, 2013, Art. no. 2993.

[17] I. Psorakis, S. Roberts, M. Ebden, and B. Sheldon, “Overlapping commu-nity detection using Bayesian non-negative matrix factorization,” Phys.Rev. E, vol. 83, no. 6, 2011, Art. no. 066114.

[18] Y. Zhang and D.-Y. Yeung, “Overlapping community detection viabounded nonnegative matrix TRI-factorization,” in Proc. ACM Int. Conf.Knowledge Discovery Data Mining, 2012, pp. 606–614.

[19] Z.-Y. Zhang, Y. Wang, and Y.-Y. Ahn, “Overlapping community detectionin complex networks using symmetric binary matrix factorization,” Phys.Rev. E, vol. 87, no. 6, 2013, Art. no. 062803.

[20] S. Fortunato, “Community detection in graphs,” Phys. Rep., vol. 486, no. 3,pp. 75–174, Feb. 2010.

[21] S. Fortunato and D. Hric, “Community detection in networks: A userguide,” Phys. Rep., vol. 659, pp. 1–44, 2016.

[22] J. J. Whang, D. F. Gleich, and I. S. Dhillon, “Overlapping community de-tection using seed set expansion,” in Proc. ACM Int. Conf. Inf. KnowledgeManage., 2013, pp. 2099–2108.

[23] K. He, Y. Sun, D. Bindel, J. Hopcroft, and Y. Li, “Detecting overlappingcommunities from local spectral subspaces,” in Proc. IEEE Int. Conf. DataMining, 2015, pp. 769–774.

[24] J. J. Whang, I. S. Dhillon, and D. F. Gleich, “Non-exhaustive, overlappingk-means,” in Proc. SIAM Int. Conf. Data Mining, 2015, pp. 936–944.

[25] E. E. Papalexakis, L. Akoglu, and D. Ience, “Do more views of a graphhelp? Community detection and clustering in multi-graphs,” in Proc. IEEEInt. Conf. Inf. Fusion, 2013, pp. 899–905.

[26] E. E. Papalexakis, N. D. Sidiropoulos, and R. Bro, “From k-means tohigher-way co-clustering: Multilinear decomposition with sparse latentfactors,” IEEE Trans. Signal Process., vol. 61, no. 2, pp. 493–506, Jan.2013.

Page 16: Identification of Overlapping Communities via Constrained ... · networks such as brain [1], trend analysis in social media [2], [3], and clustering of costumers in recommender systems

SHEIKHOLESLAMI AND GIANNAKIS: IDENTIFICATION OF OVERLAPPING COMMUNITIES VIA CONSTRAINED EGONET TENSOR 5745

[27] J. Yang, J. McAuley, and J. Leskovec, “Community detection in net-works with node attributes,” in Proc. IEEE Int. Conf. Data Mining, 2013,pp. 1151–1156.

[28] M. Araujo et al., “Com2: Fast automatic discovery of temporal (comet)communities,” in Proc. Pac. Asia Conf. Knowledge Discovery Data Min-ing, 2014, pp. 271–283.

[29] B. Baingana and G. B. Giannakis, “Joint community and anomaly track-ing in dynamic networks,” IEEE Trans. Signal Process., vol. 64, no. 8,pp. 2013–2025, Apr. 2016.

[30] F. Huang, U. Niranjan, M. U. Hakeem, and A. Anandkumar, “Onlinetensor methods for learning latent variable models,” J. Mach. Learn. Res.,vol. 16, pp. 2797–2835, 2015.

[31] F. Sheikholeslami, B. Baingana, G. B. Giannakis, and N. D. Sidiropoulos,“Egonet tensor decomposition for community identification,” in Proc.IEEE Global Conf. Signal Inf. Process., Washingto, DC, USA, Dec. 2016,pp. 341–345.

[32] A. R. Benson, D. F. Gleich, and J. Leskovec, “Tensor spectral clusteringfor partitioning higher-order network structures,” in Proc. SIAM Int. Conf.Data Mining, Vancouver, Canada, Feb. 2015, pp. 118–126.

[33] T. G. Kolda, B. W. Bader, and J. P. Kenny, “Higher-order web link analysisusing multilinear algebra,” in Proc. IEEE Int. Conf. Data Mining, 2005,8 pp.

[34] J. Kruskal, “Three-way arrays: Rank and uniqueness of trilinear decom-positions, with application to arithmetic complexity and statistics,” LinearAlgebra Appl., vol. 18, no. 2, pp. 95–138, 1977.

[35] N. Sidiropoulos and R. Bro, “On the uniqueness of multilinear decom-position of N-way arrays,” J. Chemometrics, vol. 14, no. 3, pp. 229–239,Jun. 2000.

[36] L. Chiantini and G. Ottaviani, “On generic identifiability of 3-tensors ofsmall rank,” SIAM J. Matrix Anal. Appl., vol. 33, no. 3, pp. 1018–1037,2012.

[37] L. Akoglu, M. McGlohon, and C. Faloutsos, “Oddball: Spotting anomaliesin weighted graphs,” in Proc. Adv. Knowledge Discovery Data Mining,pp. 410–421, 2010.

[38] J. Leskovec and J. J. Mcauley, “Learning to discover social circles in egonetworks,” in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 539–547.

[39] C. Lan, Y. Yang, X. Li, B. Luo, and J. Huan, “Learning social circlesin ego-networks based on multi-view network structure,” IEEE Trans.Knowl. Data Eng., vol. 29, no. 8, pp. 1681–1694, Aug. 2017.

[40] T. G. Kolda and B. W. Bader, “Tensor decompositions and applications,”SIAM Rev., vol. 51, no. 3, pp. 455–500, 2009.

[41] E. E. Papalexakis, C. Faloutsos, and N. D. Sidiropoulos, “Parcube: Sparseparallelizable tensor decompositions,” in Proc. Joint Eur. Conf. Mach.Learn. Knowledge Discovery Databases, Bristol, U.K., 2012, pp. 521–536.

[42] S. Smith and G. Karypis, “SPLATT: The Surprisingly ParalleL spArseTensor Toolkit.” [Online]. Available: http://cs.umn.edu/ splatt/

[43] M. Mardani, G. Mateos, and G. B. Giannakis, “Subspace learning andimputation for streaming big data matrices and tensors,” IEEE Trans.Signal Process., vol. 63, no. 10, pp. 2663–2677, May 2015.

[44] J. A. Bazerque, G. Mateos, and G. B. Giannakis, “Rank regularization andBayesian inference for tensor completion and extrapolation,” IEEE Trans.Signal Process., vol. 61, no. 22, pp. 5689–5703, Nov. 2013.

[45] H. Bozdogan, “Model selection and akaike’s information criterion (AIC):The general theory and its analytical extensions,” Psychometrika, vol. 52,no. 3, pp. 345–370, 1987.

[46] J. Rissanen, “Modeling by shortest data description,” Automatica, vol. 14,no. 5, pp. 465–471, 1978.

[47] G. B. Giannakis, Q. Ling, G. Mateos, I. D. Schizas, and H. Zhu, “Decen-tralized learning for wireless communications and networking,” SplittingMethods in Communication, Imaging, Science, and Engineering. Berlin,Germany: Springer, 2016, pp. 461–497.

[48] K. Huang, N. D. Sidiropoulos, and A. P. Liavas, “A flexible and efficientalgorithmic framework for constrained matrix and tensor factorization,”IEEE Trans. Signal Process., vol. 64, no. 19, pp. 5052–5065, Oct. 2016.

[49] J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra, “Efficient projec-tions onto the l 1-ball for learning in high dimensions,” in Proc. Int. Conf.Mach. Learn., 2008, pp. 272–279.

[50] E. J. Candes and M. B. Wakin, “An introduction to compressive sampling,”IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.

[51] M. Razaviyayn, M. Hong, and Z.-Q. Luo, “A unified convergence analysisof block successive minimization methods for nonsmooth optimization,”SIAM J. Optim., vol. 23, no. 2, pp. 1126–1153, 2013.

[52] B. Karrer and M. E. Newman, “Stochastic blockmodels and communitystructure in networks,” Phys. Rev. E, vol. 83, no. 1, Jan. 2011, Art. no.016107.

[53] E. Abbe, “Community detection and stochastic block models: Recentdevelopments,” J. Mach. Learn. Res., vol. 18, no. 177, pp. 1–86, 2018.

[54] A. Lancichinetti, S. Fortunato, and J. Kertesz, “Detecting the overlappingand hierarchical community structure in complex networks,” New J. Phys.,vol. 11, no. 3, 2009, Art. no. 033015.

[55] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney, “Communitystructure in large networks: Natural cluster sizes and the absence of largewell-defined clusters,” Internet Math., vol. 6, no. 1, pp. 29–123, 2009.

[56] L. Gauvin, A. Panisson, and C. Cattuto, “Detecting the community struc-ture and activity patterns of temporal networks: A non-negative tensorfactorization approach,” PloS One, vol. 9, no. 1, 2014, Art. no. e86028.

[57] A. Lancichinetti, F. Radicchi, J. J. Ramasco, and S. Fortunato, “Findingstatistically significant communities in networks,” PloS One, vol. 6, no. 4,2011, Art. no. e18961.

[58] L. Bohlin, D. Edler, A. Lancichinetti, and M. Rosvall, “Community de-tection and visualization of networks with the map equation framework,”Measuring Scholarly Impact. Berlin, Germany: Springer, 2014, pp. 3–34.

[59] S. Fortunato and M. Barthelemy, “Resolution limit in community detec-tion,” Proc. Nat. Academy Sci., vol. 104, no. 1, pp. 36–41, 2007.

Fatemeh Sheikholeslami (S’13) received the B.Sc.and M.Sc. degrees in electrical engineering from theUniversity of Tehran, Tehran, Iran, and the SharifUniversity of Technology, Tehran, Iran, in 2010 and2012, respectively. Since September 2013, she hasbeen working toward the Ph.D. degree at the Depart-ment of Electrical and Computer Engineering, Uni-versity of Minnesota, Minneapolis, MN, USA. Herresearch interests include machine learning and net-work science.

Georgios B. Giannakis (F’97) received the Diplomadegree in electrical engineering from the NationalTechnical University of Athens, Athens, Greece,1981 and the M.Sc. degree in electrical engineering in1983, the M.Sc. degree in mathematics in 1986, andthe Ph.D. degree in electrical engineering in 1986from University of Southern California (USC), CA,USA.

From 1982 to 1986, he was with the Universityof Southern California, Los Angeles, CA, USA, Hewas with the University of Virginia from 1987 to

1998, and since 1999, he has been a Professor with the University of Min-nesota, Minneapolis, MN, USA, where he holds an Endowed Chair in WirelessTelecommunications, a University of Minnesota McKnight Presidential Chairin Electrical and Computer Engineering, and is the Director of the Digital Tech-nology Center. His general interests include communications, networking, andstatistical learning—on which he has authored or coauthored more than 430 jour-nal papers, 720 conference papers, 25 book chapters, two edited books, and tworesearch monographs (h-index 133). Current research focuses on data science,Internet of things, and network science with applications to social, brain, andpower networks with renewables. He is the (co-) inventor of 32 patents issued,and the (co-) recipient of nine best journal paper awards from the IEEE SignalProcessing (SP) and Communications Societies, including the G. Marconi PrizePaper Award in Wireless Communications. He was the recipient of TechnicalAchievement Awards from the SP Society (2000), from EURASIP (2005), aYoung Faculty Teaching Award, the G. W. Taylor Award for Distinguished Re-search from the University of Minnesota, and the IEEE Fourier Technical FieldAward (inaugural recipient in 2015). He is a Fellow of EURASIP, and has servedthe IEEE in a number of posts, including that of a Distinguished Lecturer forthe IEEE-SPS.