Community structure and detection in complex networks Presenter: Guoliang Liu Date:4/25/2012

  • Slide 1

Presenter: Guoliang Liu Date: 4/25/2012

  • Slide 2

Background Introduction; Definition; Basic idea of partition; Quality Function; Classification Based On Algorithms; Benchmarks; Applications; Conclusion

  • Slide 3

Real-world networks: biological networks, social networks, technical networks.
The small-world effect; transitivity; degree distributions; network resilience; mixing patterns; degree correlations; community structure; network navigation; others such as self-similarity.

  • Slide 4

What is Community Structure?
Fig. 1 (Girvan and Newman, 2004)

  • Slide 5

Definition of Community Structure
Note: there is no universally accepted definition.
Basic idea: a random graph does not have community structure. In a non-random graph, a community must have more edges inside it than edges linking its vertices with the rest of the graph.
Maximize

  • Slide 6

How can we partition the graph?
Partitions can be hierarchically ordered when the graph has different scales. In this case, clusters in turn display community structure, with smaller communities inside, which may contain smaller communities, and so on.
Hierarchical Structure

  • Slide 7

Most well-known metric: Modularity. How different is this graph from a random graph?

  • Slide 8

Classification
Hierarchical Structure
    Divisive Algorithms
        Foundation work: Girvan and Newman, 2002/2004
        Modifications of the GN divisive method
        Representative work: Tyler et al., 2003; Wilkinson and Huberman, 2004; Zhou et al., 2006; Chen and Yuan, 2006; Rattigan et al., 2007; Pinney and Westhead, 2006
    Agglomerative Algorithms
        Foundation work: Newman, 2004
        Modifications of the GN agglomerative method and newly proposed agglomerative methods
        Representative work: Zhenqing Ye et al., 2008; Vincent D. Blondel et al., 2009; Nam P. Nguyen et al., 2011
Non-hierarchical
    Constructive Algorithms
        Relatively new approaches
        Representative work: D. Shah and T. Zaman, 2010; R. R. Khorasgani et al., 2010; Rushed Kanawati, 2011
    Optimization approach
        Representative work: S. Li et al., 2010; D. Jin et al., 2010; C. Shi et al., 2010; Thang N. Dinh et al., 2011

  • Slide 9

Hierarchical Structure: Fig. 2 (Girvan and Newman, 2004)
Agglomerative / Divisive

  • Slide 10

Foundation work (Girvan and Newman, 2004)
1. Calculate betweenness scores for all edges in the network.
2. Find the edge with the highest score and remove it from the network.
3. Recalculate betweenness for all remaining edges.
4. Repeat from step 2.
How to measure betweenness: shortest path; random walk; current-flow.

  • Slide 11

Foundation work (Newman, 2004)
Based on modularity Q.
1. Initially, treat each node as a single community.
2. Calculate the modularity gain for each pair of neighboring communities.
3. Find the pair with the largest gain in modularity and merge these two communities into one.
4. Repeat from step 2 until only one community remains.
5. Select the level of the hierarchy with the largest modularity.

  • Slide 12

Newman 2004, continued

  • Slide 13

Fast unfolding of communities in large networks (Vincent D. Blondel, 2009)
A modification of Newman's fast algorithm (2004).
Makes use of another property of complex networks: self-similarity (treat each community as a single node).
Unlike Newman 2004, every iteration treats each community as a single node.
Advantage: much faster when calculating the modularity of each merged community.

  • Slide 14

Vincent D. Blondel, 2009, continued

  • Slide 15

GN benchmark (Girvan and Newman, 2004)
Derived from the planted l-partition model.
Benchmark graphs consist of 128 nodes with expected degree 16, divided into four groups of 32 nodes each.

  • Slide 16

LFR benchmark
Compared with the GN benchmark, the LFR benchmark takes power-law degree distributions into account, which is another property of complex networks. Hence, the LFR benchmark is a more realistic test for detection algorithms.
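The GN benchmark and the modularity measure mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the presenter's code: the function names `gn_benchmark` and `modularity` are hypothetical, the parameter `z_out` (expected number of inter-group edges per node) is an assumption that does not appear in the slides, and the modularity is computed with the standard Newman-Girvan form Q = sum over communities c of [ e_c/m - (d_c/2m)^2 ], where m is the number of edges, e_c the number of edges inside c, and d_c the total degree of c.

```python
import random
from itertools import combinations


def gn_benchmark(z_out=4.0, n_groups=4, group_size=32,
                 expected_degree=16.0, seed=0):
    """Planted-partition graph in the style of the GN benchmark.

    Returns (edges, group), where group[v] is the planted community of v.
    z_out is an assumed knob: expected inter-group edges per node.
    """
    rng = random.Random(seed)
    n = n_groups * group_size
    group = [v // group_size for v in range(n)]
    # Link probabilities chosen so the expected degree is z_in + z_out.
    p_in = (expected_degree - z_out) / (group_size - 1)
    p_out = z_out / (n - group_size)
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if group[u] == group[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, group


def modularity(edges, community):
    """Modularity Q of a partition; community[v] is the label of node v."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}    # number of edges with both endpoints in community c
    degree_sum = {}  # total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


if __name__ == "__main__":
    edges, group = gn_benchmark(z_out=4.0)
    avg_degree = 2 * len(edges) / len(group)
    print(f"nodes={len(group)} edges={len(edges)} avg_degree={avg_degree:.2f}")
    print(f"Q(planted partition) = {modularity(edges, group):.3f}")
```

Raising `z_out` toward 8 blurs the planted groups and lowers the modularity of the planted partition, which is how a benchmark of this kind is typically used to stress a detection algorithm.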

  • Slide 17

More information about benchmarks: http://www.cs.gsu.edu/~gliu6/courseCSC8530.html

  • Slide 18

Clustering Web clients: grouping users who have similar interests and are geographically near each other may improve the performance of services provided on the World Wide Web.
Clusters of large graphs: can be used to create data structures that efficiently store the graph data and handle navigational queries, such as path searches.
Data dissemination in mobile social networks: how to find the most influential nodes.
Processor allocation in parallel computing: it is crucial to know the best way to allocate tasks to processors so as to minimize the communication between them and enable rapid calculation.

  • Slide 19

Community detection has been studied for a long time, and with the growth of real-world complex networks it remains a popular topic in fields such as economics, physics, and computer science.

  • Slide 20

[6] M. Girvan and M. E. J. Newman, "Community structure in social and biological networks," Proc. Nat. Acad. Sci. USA, vol. 99, no. 12, pp. 7821-7826, Jun. 11, 2002.
[7] M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Phys. Rev. E, vol. 69, 026113, 2004.
[9] M. E. J. Newman, "Fast algorithm for detecting community structure in networks," Phys. Rev. E, vol. 69, 066133, 2004.
[10] A. Clauset, M. E. J. Newman, and C. Moore, "Finding community structure in very large networks," Phys. Rev. E, vol. 70, 066111, 2004.
[11] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," J. Stat. Mech., Theory Exp., 2008, DOI: 10.1088/1742-5468/2008/10/P10008.
[12] M. Saravanan, G. Prasad, K. Surana, and D. Suganthi, "Labeling communities using structural properties," in Proc. Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2010, pp. 217-224.
[39] R. Kanawati, "LICOD: Leaders Identification for Community Detection in Complex Networks," in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), pp. 577-582, 9-11 Oct. 2011.
[40] N. P. Nguyen, T. N. Dinh, D. T. Nguyen, and M. T. Thai, in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), pp. 35-40, 9-11 Oct. 2011.
[41] D. Shah and T. Zaman, "Community detection in networks: The leader-follower algorithm," in Workshop on Networks Across Disciplines in Theory and Applications, NIPS, November 2010.
[42] R. R. Khorasgani, J. Chen, and O. R. Zaiane, "Top leaders community detection approach in information networks," in 4th SNA-KDD Workshop on Social Network Mining and Analysis, Washington D.C., July 2010.
[43] N. P. Nguyen, T. N. Dinh, D. T. Nguyen, and M. T. Thai, "Overlapping community structures and their detection on social networks," in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), pp. 35-40, 9-11 Oct. 2011.