IBM Confidential
HiMap: Adaptive Visualization of Large-Scale Online Social Networks
Lei Shi, Nan Cao, Shixia Liu, Weihong Qian, Li Tan, Guodong Wang
Jimeng Sun and Ching-yung Lin
IBM Research
Online Social Networks Are Prevailing…
• Commercial Online Social Networks
– Worldwide: Facebook, MySpace, Wikipedia;
– In China: Kaixin001.com, Xiaonei.com;
– Vertical web 2.0 websites;
• Enterprise Social Networks
– IBM Smallblue: A social networking application to build, maintain and utilize the intra-company expert network;
The Resulting Huge Graph
• Large
– Facebook: 200M+ nodes (users);
– MySpace: An available graph with 8.8M+ edges (friendships);
– Smallblue: 25k+ nodes (employees), 140k+ edges (Rel.);
• Other Characteristics (cited from the IMC’07 best paper)
– Power-law;
– Small world;
– Scale-free;
– Densely connected core plus small clusters;
• Visualization is useful for both admins and end users
– Overview;
– Guided navigation;
– Advanced analysis;
Outline
• Problem
• Solution
– Clustered graph visualization;
– Leverage user interaction and navigation;
• Technical Details
– Adaptive graph summarization and loading;
– Stabilized clustered graph layout algorithm;
– Customized user interactions and animations;
– Experimental results;
• Related Works
• Summary
• A Preliminary Demo
Problem
• Problem: how to visualize the huge graph so as to satisfy:
– Readability: Each graph view of the network should be visualized in a readable manner that is easy to comprehend, independent of its scale, topology and the size of the display screen;
– Reachability: A suite of navigation methods should be provided so that every detail of the network can be visualized and diagnosed;
– Stability: Smooth animations should be presented between any view changes, so as to preserve the user’s mental map;
– Responsiveness: The visualization system should run fast enough and stay lightweight: it should keep up with the animation speed and load social network data incrementally and on demand;
• Technical Challenges (Trade-Offs):
– Visualization complexity: Information volume vs. human perception capacity (readability);
– Data loading: Memory usage vs. response time;
– Graph layout: Layout quality (readability) vs. response time;
– Layout stability: Graph readability vs. stability;
Solutions: HiMap Overview
• Clustered Graph Visualization:
– Social networks possess a highly clustered and self-similar community structure;
– Clustered structure enables semantic abstraction of large-scale graphs;
• Adaptive Graph Summarization and Loading:
– Clustering -> Item ranking -> Online item quantifying -> Online graph data loading -> Visualization;
• Stable Layout Algorithm for Clustered Graph:
– Preserve scaled user mental map;
– Avoid cluster overlapping;
Offline Graph Summarization
• Hierarchical graph clustering:
– Clustering based on the graph topology;
– Modify the modularity (Q value) based clustering algorithm;
  • Modularity (Q value); (See backup slides)
  • Use relative Q as heuristic in the greedy algorithm;
– Iterative implementation for hierarchical clustering;
• Item ranking:
– Pre-compute offline for nodes under the same parent node;
– Coverage-based ranking algorithm: greedy algorithm to maximize coverage set S+;
– Clustered betweenness centrality based ranking algorithm;
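The coverage-based ranking can be illustrated with a minimal Python sketch. The slide does not define the coverage set S+, so the sketch assumes that a node covers itself and its direct neighbours among the siblings under one parent, and that each greedy step picks the node adding the most not-yet-covered nodes. The function name `coverage_rank` and the adjacency-dict input are hypothetical, not taken from HiMap.

```python
def coverage_rank(adjacency):
    """Rank sibling nodes greedily by marginal coverage.

    adjacency: dict mapping each node to the set of its neighbours
    (assumed to be restricted to nodes under the same parent).
    """
    covered = set()
    remaining = set(adjacency)
    ranking = []
    while remaining:
        # Marginal gain: nodes that become covered if this node is picked.
        # sorted() makes tie-breaking deterministic (first in sort order wins).
        best = max(
            sorted(remaining),
            key=lambda n: len(({n} | adjacency[n]) - covered),
        )
        ranking.append(best)
        covered |= {best} | adjacency[best]
        remaining.remove(best)
    return ranking


# A star around "a" plus a separate pair "e"-"f".
siblings = {
    "a": {"b", "c", "d"},
    "b": {"a"}, "c": {"a"}, "d": {"a"},
    "e": {"f"}, "f": {"e"},
}
print(coverage_rank(siblings))
```

In this toy input the hub "a" is ranked first and "e" second, because after "a" is chosen only the "e"-"f" pair still adds uncovered nodes.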
Online Adaptive Data Loading
• Visual item quantifying
– Graph visual density:
– Visual item quantifying: recursively allocate screen space to the hierarchical graph nodes/clusters, proportional to their weights (number of leaf nodes) and in order of their rank;
– Flexible control parameters;
  • Maximal Visualization Breadth;
  • Minimal Visualization Depth;
  • Maximal Visualization Depth;
  • Overview Depth;
• Adaptive summarization: load the graph data with the highest ranks within the computed quantity in each hierarchy
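A rough Python sketch of the recursive quantifying step follows. It is an illustration under assumptions, not the HiMap implementation: the screen space is modelled as an integer budget of drawable items, each cluster's children are assumed to be pre-sorted by rank, and the weight of a cluster is its number of leaf nodes. The names `leaf_count` and `quantify` and the nested-dict tree format are hypothetical, and the control parameters listed on the slide are not modelled.

```python
def leaf_count(node):
    """Weight of a node: the number of leaves in its subtree."""
    kids = node.get("children", [])
    return 1 if not kids else sum(leaf_count(k) for k in kids)


def quantify(node, budget):
    """Return the names of the items to draw, using at most `budget` items."""
    kids = node.get("children", [])
    if not kids or budget < 2:
        return [node["name"]]      # a leaf, or a cluster drawn collapsed
    visible = kids[:budget]        # keep only the highest-ranked children
    spare = budget - len(visible)  # budget left after one item per child
    total = sum(leaf_count(k) for k in visible)
    items = []
    for kid in visible:
        # One guaranteed item plus a share of the spare budget,
        # proportional to the child's leaf count (rounded down).
        share = 1 + spare * leaf_count(kid) // total
        items.extend(quantify(kid, share))
    return items


tree = {"name": "root", "children": [
    {"name": "A", "children": [{"name": f"a{i}"} for i in range(1, 5)]},
    {"name": "B", "children": [{"name": "b1"}, {"name": "b2"}]},
    {"name": "c"},
]}
print(quantify(tree, 5))
```

With a budget of 5, cluster A (four leaves) receives two items and is expanded to its two top-ranked members, while B stays collapsed. Because shares are rounded down, the number of items returned never exceeds the budget.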
Multi-Level Clustered Graph Layout
• Recursive implementation of Kamada-Kawai layout algorithm:
– Bottom-up multi-level layout in the world coordinate system;
– Top-down projection in the screen coordinate system;
[Figure: the input graph is laid out from the graphs at height 0 up to the graphs at height n in the world coordinate system, then projected from the graphs at height n down to the graphs at height 0 in the screen coordinate system.]
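The two passes can be sketched in Python as follows. This is a simplified illustration: a plain circular placement stands in for the Kamada-Kawai solver, each cluster is assumed to occupy a circle on screen, and the `shrink` ratio between a parent's radius and a child's radius is an arbitrary choice. All function names and the nested-dict tree format are hypothetical.

```python
import math


def local_layout(n):
    """Stand-in for Kamada-Kawai: place n items evenly on a unit circle."""
    if n == 1:
        return [(0.0, 0.0)]
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]


def layout_bottom_up(node):
    """Give every child local (world) coordinates, deepest graphs first."""
    kids = node.get("children", [])
    for kid in kids:
        layout_bottom_up(kid)
    for kid, pos in zip(kids, local_layout(len(kids))):
        kid["local"] = pos


def project_top_down(node, center, radius, out, shrink=0.4):
    """Map local coordinates into the screen circle assigned to `node`."""
    out[node["name"]] = center
    for kid in node.get("children", []):
        lx, ly = kid["local"]
        # Child centres stay within (1 - shrink) * radius of the parent
        # centre, so each child circle lies inside its parent's circle.
        kid_center = (center[0] + radius * (1 - shrink) * lx,
                      center[1] + radius * (1 - shrink) * ly)
        project_top_down(kid, kid_center, radius * shrink, out, shrink)


graph = {"name": "root", "children": [
    {"name": "A", "children": [{"name": "a1"}, {"name": "a2"}]},
    {"name": "b"},
]}
layout_bottom_up(graph)
screen = {}
project_top_down(graph, (100.0, 100.0), 100.0, screen)
```

The projection only guarantees that a child's circle stays inside its parent's circle. It does nothing to keep sibling clusters from overlapping, which the slides address separately in the stable layout algorithm.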
Stable Graph Layout Algorithm (1)
• Stable graph layout algorithm in [Boitmanis’07][Cao’08]:
– Minimize the stress energy, made up of a readability term and a stability term;
• Preserving scaled mental map:
– Minimize the stress energy, again made up of a readability term and a stability term;
[Figure: layout G0 and layouts G1, G1' and G1'' drawn in x-y coordinates.]
Stable Graph Layout Algorithm (2)
• Combined naturally with iterative solvers of KK algorithm
• Optimized for clustered graph to avoid overlapping
• Input: 1) preceding graph G0 and its previous position P; 2) succeeding graph G1;
• Output: the final node positions, used as the layout of succeeding graph G1;
[Flowchart: compute the intersection I = G0 ∩ G1. If I is empty, lay out G1 statically. Otherwise, transform the nodes in I, lay out G1 with the iterative solver and refine the scaling factor c, repeating until the end-of-iteration test passes; a further step transforms the nodes in G1.]
Customized User Interaction and Animation
• Graph Interaction Design
– Traditional zoom and pan, highlight, drag;
– Graph navigation through hierarchical and semantic zoom-in/zoom-out;
– Dynamic filtering/query;
• Customized animation to preserve user’s mental map in navigating the map
– Benefits from the stable graph layout;
Experimental Results: Adaptive Visualization
• Adaptive Visualization
– Compare graph visual density under adaptive visualization and previous metric-based graph filtering;
– Adaptive visualization achieves rigid graph visual density control;
Experimental Results: Cluster Overlapping
• Cluster Overlapping
– Optimized KK layout algorithm greatly reduces the cluster overlapping probability;
Experimental Results: Layout Stability
• Layout Stability
– Both the readability term and the stability term of the stress energy of our algorithm are significantly smaller than those of the previous stable layout algorithm, which indicates that our algorithm achieves both better readability and better stability;
Related Work
• Visualization systems:
– SocialAction: Flexible complex network visualization and diagnosis;
– Vizster: Navigation & interactive community detection;
– Matrix Explorer and NodeTrix: Hybrid visualization;
– Small-world graph visualization: Focus+context;
• Clustered graph visualization:
– Feng’s PhD thesis on clustered graph drawing;
– Eades and Huang’s pioneering work;
– Multi-scale visualization by Auber et al.;
• Graph summarization:
– Node abstraction and edge grouping by Six et al.;
– Edge summarization and bundling;
– OntoVis: graph summarization by ontology;
• Graph layout:
– Refer to the classic graph drawing books;
Summary & Future Work
• HiMap is a visualization system that displays the hierarchical structure and relationships of large-scale social networks
• HiMap places primary emphasis on eliminating the visual clutter that naturally arises from the huge user base of state-of-the-art social networks
• We have developed a visual-density-based adaptive data loading technique and an optimized layout algorithm for clustered graphs
• Experimental results demonstrate that HiMap is capable of rigidly controlling the visual density of the graph view, while limiting the cluster overlap probability to a rather low level and improving both graph readability and stability
Preliminary Demo
• Java-based desktop system
• Dataset:
– Smallblue community (250k+ people)
– Two departments of Tsinghua University on Xiaonei.com;
– Academic social network over DBLP dataset;
• Extensions:
– Web-based system;
– Flash-based system;
Q & A
Backup – Modularity-based Clustering
• Definitions: given a graph with vertices (v1, .. vN) and edges, for any partition (clustering) that divides the vertices into several groups, denote
– $e_{ij}$: the fraction of edges in the graph that connect vertices in group i to those in group j
– $a_i = \sum_j e_{ij}$: the fraction of edges in the graph that connect to vertices in group i
• Modularity of a partition (clustering)
$$Q = \sum_i \left( e_{ii} - a_i^2 \right)$$
• Find the best clustering by maximizing Q
– Greedy algorithm: iterative binary clustering using Q as heuristics;
$$\Delta Q = e_{ij} + e_{ji} - 2 a_i a_j = 2 \left( e_{ij} - a_i a_j \right)$$
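The modularity definition on this slide can be checked numerically with a short Python sketch. It assumes an undirected, unweighted graph given as an edge list, with each edge split evenly between $e_{ij}$ and $e_{ji}$ so that the matrix is symmetric and sums to 1. The function names are hypothetical, and `merge_gain` evaluates $\Delta Q = 2(e_{ij} - a_i a_j)$ for joining two groups, as in the standard greedy agglomerative scheme; the relative-Q heuristic mentioned in the main slides is not modelled.

```python
from collections import defaultdict


def group_fractions(edges, group_of):
    """Return (e, a): e[(i, j)] edge fractions between groups, a[i] row sums."""
    m = len(edges)
    e = defaultdict(float)
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        # Half of each edge goes to e[gu, gv] and half to e[gv, gu].
        e[(gu, gv)] += 0.5 / m
        e[(gv, gu)] += 0.5 / m
    a = defaultdict(float)
    for (i, _), frac in list(e.items()):
        a[i] += frac
    return e, a


def modularity(edges, group_of):
    """Q = sum_i (e_ii - a_i^2)."""
    e, a = group_fractions(edges, group_of)
    return sum(e[(i, i)] - a[i] ** 2 for i in list(a))


def merge_gain(edges, group_of, i, j):
    """Change in Q if groups i and j are joined: 2 * (e_ij - a_i * a_j)."""
    e, a = group_fractions(edges, group_of)
    return 2 * (e[(i, j)] - a[i] * a[j])


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {v: 0 for v in range(6)}
print(modularity(edges, two_groups))        # about 0.357 (= 5/14)
print(merge_gain(edges, two_groups, 0, 1))  # about -0.357, so merging lowers Q
```

Splitting the graph into its two triangles gives Q = 5/14, while putting every vertex in one group gives Q = 0; the difference matches the merge gain of -5/14.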