
IBM Confidential

HiMap: Adaptive Visualization of Large-Scale Online Social Networks

Lei Shi, Nan Cao, Shixia Liu, Weihong Qian, Li Tan, Guodong Wang

Jimeng Sun and Ching-yung Lin

IBM Research


Online Social Networks Are Prevailing…

• Commercial Online Social Networks
  – Worldwide: Facebook, MySpace, Wikipedia;
  – In China: Kaixin001.com, Xiaonei.com;
  – Vertical web 2.0 websites;

• Enterprise Social Networks
  – IBM Smallblue: A social networking application to build, maintain and utilize the intra-company expert network;


The Resulting Huge Graph

• Large
  – Facebook: 200M+ nodes (users);
  – MySpace: An available graph with 8.8M+ edges (friendships);
  – Smallblue: 25k+ nodes (employees), 140k+ edges (relationships);

• Other Characteristics (cited from the IMC’07 best paper)
  – Power-law;
  – Small world;
  – Scale-free;
  – Densely connected core plus small clusters;

• Visualization is useful for both administrators and end users
  – Overview;
  – Guided navigation;
  – Advanced analysis;


Outline

• Problem

• Solution
  – Clustered graph visualization;
  – Leverage user interaction and navigation;

• Technical Details
  – Adaptive graph summarization and loading;
  – Stabilized clustered graph layout algorithm;
  – Customized user interactions and animations;
  – Experimental results;

• Related Work

• Summary

• A Preliminary Demo


Problem

• Problem: How to visualize the huge graph so as to satisfy:
  – Readability: Each graph view of the network should be visualized in a readable manner that is easy to comprehend, independent of its scale, its topology and the size of the display screen;
  – Reachability: A suite of navigation methods should be provided so that every detail of the network can be visualized and diagnosed;
  – Stability: Smooth animations should be presented between any view changes, so as to keep the user’s momentum (mental map);
  – Responsiveness: The visualization system should run fast and stay lightweight: it should keep up with the animation speed and load social network data incrementally and on demand;

• Technical Challenges (Trade-Offs):
  – Visualization complexity: Information volume vs. human perception capacity (readability);
  – Data loading: Memory usage vs. response time;
  – Graph layout: Layout quality (readability) vs. response time;
  – Layout stability: Graph readability vs. stability;


Solutions: HiMap Overview

• Clustered Graph Visualization:
  – Social networks possess a highly clustered and self-similar community structure;
  – The clustered structure enables semantic abstraction of large-scale graphs;

• Adaptive Graph Summarization and Loading:
  – Clustering -> Item ranking -> Online item quantifying -> Online graph data loading -> Visualization;

• Stable Layout Algorithm for Clustered Graphs:
  – Preserve the scaled user mental map;
  – Avoid cluster overlapping;


Offline Graph Summarization

• Hierarchical graph clustering:
  – Clustering based on the graph topology;
  – Modify the modularity (Q value) based clustering algorithm;
    • Modularity (Q value); (See backup slides)
    • Use relative Q as heuristic in the greedy algorithm;
  – Iterative implementation for hierarchical clustering;

• Item ranking:
  – Pre-compute offline for nodes under the same parent node;
  – Coverage-based ranking algorithm: greedy algorithm to maximize coverage set S+;
  – Clustered betweenness centrality based ranking algorithm;
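The coverage-based ranking can be illustrated with a minimal greedy sketch. It assumes that the coverage set S+ consists of the nodes selected so far together with their direct neighbours, and that ties are broken by node name; neither detail is stated in the slides, and `coverage_rank` is a hypothetical helper name.

```python
def coverage_rank(adjacency):
    """Rank sibling nodes greedily by how much each one enlarges the coverage set.

    adjacency maps a node name to the set of its neighbours (an assumed input
    format). The coverage set is assumed to be the picked nodes plus their
    neighbours.
    """
    covered = set()
    remaining = set(adjacency)
    ranking = []
    while remaining:
        # Marginal gain: nodes (itself + neighbours) not yet covered.
        # sorted() makes tie-breaking deterministic (first name wins).
        best = max(sorted(remaining),
                   key=lambda n: len(({n} | adjacency[n]) - covered))
        ranking.append(best)
        covered |= {best} | adjacency[best]
        remaining.remove(best)
    return ranking


# Toy sibling graph: "a" is a hub, "e" covers the otherwise unreached "f".
siblings = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
print(coverage_rank(siblings))
```

Because the ranking depends only on the nodes under one parent, it could be computed once offline and stored with the hierarchy, which matches the pre-computation bullet above.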


Online Adaptive Data Loading

• Visual item quantifying
  – Graph visual density:
  – Visual item quantifying: recursively allocate screen space to the hierarchical graph nodes/clusters, proportional to their weights (number of leaf nodes) and in rank order;
  – Flexible control parameters;
    • Maximal Visualization Breadth;
    • Minimal Visualization Depth;
    • Maximal Visualization Depth;
    • Overview Depth;

• Adaptive summarization: load the graph data with the highest ranks within the computed quantity in each hierarchy
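The recursive quantifying step might look like the sketch below. It is an illustration under stated assumptions, not HiMap's actual procedure: the screen budget is expressed as a count of visual items (`quota`), children are already stored in rank order, only the Maximal Visualization Breadth parameter is modelled, and the names `Cluster` and `quantify` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    children: list = field(default_factory=list)  # assumed pre-sorted by rank

    @property
    def weight(self):
        # Weight = number of leaf nodes underneath (a leaf counts as 1).
        if not self.children:
            return 1
        return sum(child.weight for child in self.children)


def quantify(cluster, quota, max_breadth, shown_by_cluster=None):
    """Decide which children of each cluster get loaded for a given item quota."""
    if shown_by_cluster is None:
        shown_by_cluster = {}
    if not cluster.children:
        return shown_by_cluster
    # Take the highest-ranked children that fit the quota and the breadth cap.
    shown = cluster.children[:min(quota, max_breadth)]
    shown_by_cluster[cluster.name] = [child.name for child in shown]
    spare = quota - len(shown)  # items left over for deeper levels
    total = sum(child.weight for child in shown)
    for child in shown:  # visited in rank order
        # Each child receives a share proportional to its leaf count.
        quantify(child, spare * child.weight // total, max_breadth, shown_by_cluster)
    return shown_by_cluster


root = Cluster("root", [
    Cluster("A", [Cluster("a1"), Cluster("a2"), Cluster("a3")]),
    Cluster("B", [Cluster("b1"), Cluster("b2")]),
    Cluster("c"),
])
print(quantify(root, quota=6, max_breadth=3))
```

A cluster that ends up with an empty list stays collapsed, so only the highest-ranked items within the computed quantity would need to be loaded.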


Multi-Level Clustered Graph Layout

• Recursive implementation of Kamada-Kawai layout algorithm:
  – Bottom-up multi-level layout in the world coordinate system;
  – Top-down projection in the screen coordinate system;

[Figure: starting from the graph input, layout proceeds from graphs at height 0 up to graphs at height n in the world coordinate system; projection proceeds from graphs at height n down to graphs at height 0 in the screen coordinate system.]


Stable Graph Layout Algorithm (1)

• Stable graph layout algorithm in [Boitmanis’07][Cao’08]:
  – Minimize the stress energy (readability term, stability term)

• Preserving scaled mental map:
  – Minimize the stress energy (readability term, stability term)

[Figure: layouts of G0, G1', G1'' and G1 drawn on x-y axes.]


Stable Graph Layout Algorithm (2)

• Combined naturally with iterative solvers of KK algorithm

• Optimized for clustered graph to avoid overlapping

Input: 1) preceding graph G0 and its previous position P; 2) succeeding graph G1;

Output: the final node positions, used as the layout of succeeding graph G1;

[Flowchart: Intersect I = G0∩G1; decision "I == Ø?" (Y/N) with branch "Layout G1 statically"; steps "Transform nodes in I", "Layout G1 by iterative solver", "Refine scaling factor c"; decision "End iteration?" (Y/N); step "Transform nodes in G1".]
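A stress energy with a readability term and a stability term can be sketched as follows. The exact formulas are not given here, so everything in the block is an assumption: the readability term is a Kamada-Kawai-style normalized stress, the stability term is a quadratic penalty tying each shared node to its previous position scaled by c, and `best_scale` refines c by a closed-form least-squares fit. The names `stress_energy`, `best_scale` and `alpha` are hypothetical.

```python
import math


def stress_energy(pos, dist, prev, c=1.0, alpha=1.0):
    """Return (readability, stability) for a candidate layout of G1.

    pos:  node -> (x, y) in the candidate layout
    dist: dist[u][v] = desired distance for u < v (e.g. graph distance)
    prev: node -> (x, y) previous position, only for nodes shared with G0
    c:    scaling factor applied to the previous positions (assumed form)
    """
    nodes = sorted(pos)
    readability = 0.0
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            u, v = nodes[a], nodes[b]
            d = dist[u][v]
            gap = math.dist(pos[u], pos[v])
            readability += (gap - d) ** 2 / d ** 2
    stability = 0.0
    for n in pos:
        if n in prev:  # only nodes in the intersection I contribute
            stability += (pos[n][0] - c * prev[n][0]) ** 2
            stability += (pos[n][1] - c * prev[n][1]) ** 2
    return readability, alpha * stability


def best_scale(pos, prev):
    """Least-squares c minimizing sum |pos - c * prev|^2 over shared nodes."""
    shared = [n for n in pos if n in prev]
    num = sum(pos[n][0] * prev[n][0] + pos[n][1] * prev[n][1] for n in shared)
    den = sum(prev[n][0] ** 2 + prev[n][1] ** 2 for n in shared)
    return num / den if den else 1.0


prev = {"a": (1.0, 0.0), "b": (0.0, 1.0)}
pos = {"a": (2.0, 0.0), "b": (0.0, 2.0), "c": (2.0, 2.0)}
dist = {"a": {"b": 2.0, "c": 2.0}, "b": {"c": 2.0}}
c = best_scale(pos, prev)
print(c, stress_energy(pos, dist, prev, c=c))
```

In this toy case the new layout is an exact twofold enlargement of the old one, so once c is refined the stability term vanishes: a uniformly rescaled picture is not penalized as a change to the mental map.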


Customized User Interaction and Animation

• Graph Interaction Design
  – Traditional zoom & pan, highlight, drag;
  – Graph navigation through hierarchical and semantic zoom-in/zoom-out;
  – Dynamic filtering/query;

• Customized animation to preserve the user’s mental map while navigating the map
  – Benefits from the stable graph layout;


Experimental Results: Adaptive Visualization

• Adaptive Visualization

– Compare graph visual density under adaptive visualization and previous metric-based graph filtering;

– Adaptive visualization achieves rigid graph visual density control;


Experimental Results: Cluster Overlapping

• Cluster Overlapping

– Optimized KK layout algorithm greatly reduces the cluster overlapping probability;


Experimental Results: Layout Stability

• Layout Stability
  – Both the readability term and the stability term of the stress energy of our algorithm are significantly smaller than those of the previous stable layout algorithm, which indicates that our algorithm achieves both better readability and better stability;


Related Work

• Visualization systems:
  – SocialAction: Flexible complex network visualization and diagnosis;
  – Vizster: Navigation & interactive community detection;
  – Matrix Explorer and NodeTrix: Hybrid visualization;
  – Small-world graph visualization: Focus+context;

• Clustered graph visualization:
  – Feng’s PhD thesis on clustered graph drawing;
  – Eades and Huang’s pioneering work;
  – Multi-scale visualization by Auber et al.;

• Graph summarization:
  – Node abstraction and edge grouping by Six et al.;
  – Edge summarization and bundling;
  – OntoVis: graph summarization by ontology;

• Graph layout:
  – Refer to the classic graph drawing books;


Summary & Future Work

• HiMap is a visualization system that displays the hierarchical structure and relationships of large-scale social networks

• HiMap places its primary emphasis on eliminating the visual clutter that naturally arises from the huge user base of state-of-the-art social networks

• We have developed a visual-density-based adaptive data loading technique and an optimized layout algorithm for clustered graphs

• Experimental results demonstrate that HiMap is capable of rigidly controlling the visual density of the graph view, while limiting the cluster overlap probability to a rather low level and improving both graph readability and stability


Preliminary Demo

• Java-based desktop system

• Dataset:
  – Smallblue community (250k+ people);
  – Two departments of Tsinghua University on Xiaonei.com;
  – Academic social network over DBLP dataset;

• Extensions:
  – Web-based system;
  – Flash-based system;


Q & A


Backup – Modularity-based Clustering

• Definitions: given a graph with vertices (v1, ..., vN) and edges, for any partition (clustering) that divides the vertices into several groups, denote
  – $e_{ij}$: the fraction of edges in the graph that connect vertices in group i to those in group j
  – $a_i = \sum_j e_{ij}$: the fraction of edges in the graph that connect to vertices in group i

• Modularity of a partition (clustering)

  $Q = \sum_i (e_{ii} - a_i^2)$

• Find the best clustering by maximizing Q
  – Greedy algorithm: iterative binary clustering using Q as heuristics;

  $\Delta Q = e_{ij} + e_{ji} - 2 a_i a_j = 2 (e_{ij} - a_i a_j)$
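The modularity definitions can be checked numerically with a short sketch. It assumes the symmetric convention in which an edge between groups i and j contributes half of its 1/m share to e_ij and half to e_ji, so that ΔQ for merging two groups is e_ij + e_ji − 2·a_i·a_j; the function names and the edge-list input format are illustrative choices.

```python
from collections import defaultdict


def group_fractions(edges, group_of):
    """Compute e[(i, j)] and a[i] for an undirected edge list and a partition."""
    m = len(edges)
    e = defaultdict(float)
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        # Split each edge's 1/m share between e_ij and e_ji;
        # an edge inside one group therefore adds a full 1/m to e_ii.
        e[(gu, gv)] += 0.5 / m
        e[(gv, gu)] += 0.5 / m
    a = defaultdict(float)
    for (i, _j), fraction in e.items():
        a[i] += fraction  # a_i = sum_j e_ij
    return e, a


def modularity(edges, group_of):
    """Q = sum_i (e_ii - a_i^2)."""
    e, a = group_fractions(edges, group_of)
    return sum(e[(i, i)] - a[i] ** 2 for i in list(a))


def merge_gain(edges, group_of, i, j):
    """Change in Q if groups i and j were merged."""
    e, a = group_fractions(edges, group_of)
    return e[(i, j)] + e[(j, i)] - 2 * a[i] * a[j]


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "L", 1: "L", 2: "L", 3: "R", 4: "R", 5: "R"}
print(modularity(edges, two_groups))
print(merge_gain(edges, two_groups, "L", "R"))
```

For this toy graph the gain from merging the two triangles is negative, so a greedy procedure driven by Q would keep them as separate clusters.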