Graph Symmetry and Social Network Anonymization

Post on 01-Dec-2021

6 views 0 download

Transcript of Graph Symmetry and Social Network Anonymization

Symmetry in Complex Networks

Graph Symmetry and Social Network Anonymization

Yanghua XIAO (肖仰华)

School of computer science Fudan University

For more information, please visit http://gdm.fudan.edu.cn

Graph isomorphism determination

Symmetry in Complex Networks

An isomorphism of graphs G and H is a bijection between the vertex sets of G and H that preserve the adjacency of two graphs

László Babai 2015 ACM Knuth Prize Winner

Graph Isomorphism (GI) problem is solved in quasipolynomial time:

Symmetry in Complex Networks

Symmetry

n  Symmetry: Invariance under a group of transformations (Wey.H)

n  Key issues: ¡  Invariance ¡  Transformations

Symmetry in Complex Networks

Graph Symmetry

n  Invariance of adjacency of vertices under the permutations on vertex set.

n  'invariance' :the relation among the vertices

n  'transformation‘: is the permutation on vertex set.

Symmetry in Complex Networks

Formalizations

n  Permutation n  Automorphism n  Automorphism Group n  Structural Equivalent n  Orbit (trivial,non-trival) n  Automorphism Partition n  Vertex Invariant

Symmetry in Complex Networks

Why symmetry is so important?

n  Symmetry vs Complexity ¡  Taming complexity in nature and society are the major task

of 21st century, and "complexity can then be characterized by lack of symmetry or 'symmetry breaking'" (F. Heylighen, 1996)

n  Evolution caused by Symmetry Breaking ¡  The universal evolution is caused by symmetry break,

generating diversity and increasing complexity and energy ( Mainzer K 2005; Quack 2003).

Symmetry in Complex Networks

Outline

n  What is symmetry? n  Application

¡  Network Model ¡  Network Measurement ¡  Network Simplification ¡  Shortest path index ¡  Social network anonymization

Symmetry in Complex Networks

Symmetry in the complex networks

n  Surprising! Real networks are symmetric! n  Traditional Belief: 'almost all graphs are

asymmetric'

B. D. MacArthur, R. J. Sánchez-García, and J. W. Anderson, Symmetry in Complex Networks, Discr. Appl. Math, doi:10.1016/j.dam.2008.04.008 2008, in press.

Symmetry in Complex Networks

Outline

n  What is symmetry? n  Application

¡  Network Model ¡  Network Measurement ¡  Network Simplification ¡  Shortest path index ¡  Social network anonymization

Symmetry in Complex Networks

Emergence of symmetry in complex neworks

Symmetry in Complex Networks

Origination of symmetry in real networks

n  Similar Linkage Pattern ¡  nodes having similar property such as

degree, tend to have similar neighbors. ¡  Exact, vs non-exact

n  (Generalized) Symmetry Bicliques

Symmetry in Complex Networks

Model based on SLP

n  Example: when you first time join a social network, you not only link to stars (hubs), but also want to link to those accounts to whom most of your friends already in the network link

n  Preferential attachment with similar linkage pattern

Symmetry in Complex Networks

Simulation of SLP network n  With SLP, we can reproduce the network

symmetry

Symmetry in Complex Networks

Outline

n  What is symmetry? n  Application

¡  Network Model ¡  Network Measurement ¡  Network Simplification ¡  Shortest path index ¡  Social network privacy protection

Symmetry in Complex Networks

Symmetry-based structure entropy of complex networks

Symmetry in Complex Networks

How to quantify the complexity of a graph? n  Observation:

¡  Vertex with the same degree can be distinguished by many other metrics.

¡  Vertex within a cell of authomorphism partitioning cannot be distinguished be any structural metrics

n  Automorphism Partition is finer than Degree Partition

Automorphism partitions of networks can naturally partition the vertex set into structurally equivalent cells.

Symmetry in Complex Networks

Heterogeneity of real networks

From the symmetry perspective, the graph exhibits completely different compelxity.

Symmetry in Complex Networks

Outline

n  What is symmetry? n  Application

¡  Network Model ¡  Network Measurement ¡  Network Simplification ¡  Shortest path index ¡  Social network privacy protection

Symmetry in Complex Networks

Network Quotients: structural skeletons of complex networks Comments from one of PRE referee: ‘This manuscript makes a substantial contribution to the topic of finding the "structural skeleton" of complex networks, a topic of considerable interest in the network science literature.’

Symmetry in Complex Networks

Motivation

Symmetry in Complex Networks

Network Quotient n  Quotient :G/Aut(G) n  Simplified quotient: s quotient

¡  Coarse-graining each orbit into a vertex, preserving the adjacency between orbits

Symmetry in Complex Networks

Quotients of real graphs n  Quotient graph is significantly smaller than its

real graphs n  Many symmetric motifs contribute to the

simplication

Symmetry in Complex Networks

Structural skeleton n  Mean geodesic distance: m n  Diameter:D n  Clustering coefficient C

Graph quotient preserves most structural properties of the original graph.

Symmetry in Complex Networks

Outline

n  What is symmetry? n  Application

¡  Network Model ¡  Network Measurement ¡  Network Simplification ¡  Shortest path index ¡  Social network privacy protection

Symmetry in Complex Networks

Shortest Path Index n  Motivation

¡  Shortest path trees can be used as index to answer shortest distance query

¡  O(n2) space complexity, unacceptable to big graphs

n  How to reduce the storage cost of BFS_trees?

Yanghua Xiao,Wentao Wu, JianPei, Wei Wang, Zhenying He, Efficiently Indexing Shortest Path by Exploiting Symmetry in Graphs, EDBT 2009, comment from referees: “A pioneer work”

Symmetry in Complex Networks

BFS-trees mapping to each other

n  Many shortest path trees can be mapped to each other by automorphisms

Symmetry in Complex Networks

Our Storage Solution n  Instead of storing all shortest path trees, we store a

single shortest path tree for each orbit, and corresponding automorphisms

n  Automorphism: g1 = (v1; v2), g2 = (v5; v6)(v7; v9)(v8; v10), g3 = (v7; v8), and g4 = (v9; v10).

Symmetry in Complex Networks

Compact BFS-trees n  Each shortest path tree can be

compressed further by its quotient

Symmetry in Complex Networks

Speedup of by query SP from compact BFS index

Our solution can significantly reduce the index size and speedup the query answering on the compressed index structure.

Symmetry in Complex Networks

Outline

n  What is symmetry? n  Application

¡  Network Model ¡  Network Measurement ¡  Network Simplification ¡  Shortest path index ¡  Social network anonymization

Wentao Wu, Yanghua Xiao et.al, K-Symmetry Model for Identity Anonymization in Social Networks, EDBT’2010, March 22-26, Lausanne, Switzerland

Naïve Anonymization and Structural Re-identification (SR)

n  Naïve Anonymization: replacing name with id

n  Structural Re-identification (SR)

¡  Suppose the adversary knows some structural knowledge P : “Bob has at least 4 neighbors”, then only node 2 satisfies P and Bob will be re-identified.

K-symmetry model n  Automorphism partition of a graph

n  Vertices within the same orbit cannot be distinguished with

each other by any structural knowledge. n  If each orbit contains at least k vertices, then the probability

that an individual could be re-identified under any structural knowledge is at most 1/k.

Vertices with the same color belong to the same cell (orbit) of the automorphism partition. O = {{1, 3}, {2}, {4, 5}, {6, 8}, {7}}

Orbit Copying Operation •  We achieve K-Symmetry by introducing orbit copying

operations •  Copying vertices in a orbit as well as their adjacency

enough times until each orbit contains at least k vertices

How to Use the Anonymized Network n  Graph Utility: Scientists are usually interested in the

network’s statistical properties, such as degree distribution, average shortest path length, and so on.

n  Observation: The anonymized graph although is quite different from the original graph, but they share the same graph backbone.

n  Solution ¡  Step 1: recover the graph backbone ¡  Step 2: sample other elements under the constraint derived

from the prior knowledge about the original graphs, such as the number of vertices

Utility Results

Backbone based sampling solution can recover most statistic information of the original graph such as degree, path length, transitivity, network resilience

Symmetry in Complex Networks

References n  Yanghua Xiao and Hua Dong, Li Jin, Wei Wang, Momiao Xiong., Evolution of Structure of

Metabolic Networks, ICSB2007(8th International Conference on Systems Biology 2007 ,Hiroshima, Japan )

n  Yanghua Xiao, Ben.D.MacArthur, Hui Wang, Momiao Xiong, Wei Wang. Network Quotient: Structural Skeletons of Complex Systems. Physical Review E, Vol 78,Page: 046102, 2008,

n  Yanghua Xiao, Wentao Wu, Hui Wang, Momiao Xiong, Wei Wang. Symmetry-based Structure Entropy of Complex Networks. Physica A, 2008, Vol.387, Page:2611-2619.

n  Yanghua Xiao, Momiao Xiong, Wei Wang, Hui Wang. Emergence of Symmetry in Complex Networks. Physical Review E, Vol.77, Issue.6, 2008, page: 066108

n  Yanghua Xiao, Hua Dong, , Wentao Wu ,Momiao Xiong, Wei Wang,Baile Shi, Structure-based Graph Distance Measures of High Degree of Precision. Pattern Recognition, Vol 41, 2008, page: 3547 - 3561

n  Hui Wang, Guangle Yan, Yanghua Xiao. Symmetry in World Trade Network , Journal of Systems Science and Complexity, 2008.

n  Yanghua Xiao,Wentao Wu, JianPei, Wei Wang, Zhenying He, Efficiently Indexing Shortest Path by Exploiting Symmetry in Graphs. EDBT 2009

n  Hua Dong and Yanghua Xiao, Wei Wang, Li Jin, Momiao Xiong, Symmetry in Metabloic Networks . Journal of Computer Science and System Biology.

n  B. D. MacArthur, R. J. Sánchez-García, and J. W. Anderson, Symmetry in Complex Networks, Discr. Appl. Math, doi:10.1016/j.dam.2008.04.008 2008, in press.

n  Wentao Wu, Yanghua Xiao, Wei Wang, Zhenying He and Zhihui Wang, K-Symmetry Model for Identity Anonymization in Social Networks (pdf), 13th International Conference on Extending Database Technology (EDBT’10),

Our other works n  Real graphs are very big

¡  How to efficiently and effectively manage and analyze these big graphs (even with billions of nodes) ?

¡  Shortest distance query (VLDB2014), big graph systems(SIGMOD12),Overlapping community search(SIGMOD2013), Community search(SIGMOD2014)、Big graph partitioning(ICDE2014)

n  Real graphs are sematic rich

¡  How to construct knowledge graphs (KG) and how to use them in search, recommendation,and inference?

¡  Summarization by KG (IJCAI2015)、Recommendation using KG(WWW2014、DASFAA2015), Verb-centric KG (AAAI2016)、、User profiling by KG (ICDM2015、CIKM2015)、Knowledge Reorganization (CIKM 2014), Categorization by KG (CIKM 2015)

n  Our team: http://gdm.fudan.edu.cn n  Our systems and knowledge bases http://kw.fudan.edu.cn

What is the foundation of data science? n  Data understanding: Enable machine to

understand data without supervision n  First, data understanding is the prerequisite of all other

data processing techniques n  Second, many other challenges (such heterogeneity,

big scale) can be considered as an obstacle of machine data understanding

n  Third, free your brains instead of just free your hands

¡  Challenge n  What is understanding? How to model machine’s

understanding? n  What is the limit of machine’s understanding of data?

Will we humans be replaced by machines?

Symmetry in Complex Networks

Symmetry in Complex Networks

Q&A

Thanks!