Introduction to Information Retrieval
CS 5604: Information Storage and Retrieval
ProjCINETViz
by
Maksudul Alam,
S M Arifuzzaman, and
Md Hasanuzzaman Bhuiyan
Overview
Recap
Features
Demonstration
Technical Challenges
Future work
Project Description
• Developed a visualization module
– Visualize graphs using Gephi
– Integrate this visualization module with CINET
• Supports large network graphs
Gephi
• Java-based visualization and exploration platform
• Interactive
• Visualize all kinds of networks
• Compatible with Windows, Linux and Mac OS X
• Open-source and free
How to use Gephi?
• Stand-alone desktop application
• Java-based Gephi Toolkit library
• We use the Gephi Toolkit library
Network Representation
Network Visualization
• Typical steps to visualize a network:
1. Layout
– Random
– Force Atlas
– Yifan Hu’s
2. Feature-based organization
– Degree
– Betweenness centrality
– Closeness centrality
– Modularity
3. Visualization in web browser
– Java Applet
– JavaScript
– Flash
– WebGL
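As a rough illustration of the feature-based organization step, the sketch below computes two of the listed measures, degree and closeness centrality, on a toy undirected graph. It uses plain Python rather than the Gephi Toolkit, and the adjacency dictionary and function names are hypothetical, chosen only for this example.

```python
from collections import deque

def degree(adj):
    """Degree of each node: the number of its neighbours."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def closeness(adj, source):
    """Closeness of `source`: (reachable nodes - 1) / sum of BFS distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical toy graph: the path a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
deg = degree(adj)
clo = {v: closeness(adj, v) for v in adj}
```

The inner nodes b and c get both the higher degree and the higher closeness, which is the kind of signal a visualization can then map to node size or color.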
CINET
• Cyber-Infrastructure for NETwork Science
• Easy-to-use cyber-environment
• Provides a computational and analytic environment for network analysis
• Developed at NDSSL
• Funded by NSF
Integration of Visualization into CINET
[Figure: CINET architecture diagram. Labeled components:]
• User Interfaces: Research Interface, Instructional Interface, Viz. Interface
• Brokers: Interface Broker, GaLib Broker, NetworkX Broker, CINET Broker, Execution Broker, Resource Broker, Digital Lib Broker
• APIs: Java API, WS API, Batch API, DL API
• Blackboard
• Digital Library: Networks, Results, Preprocessed viz. data
• Measures and Analysis: GaLib, NetworkX, EpiFast (upcoming), Pajek (upcoming)
• Compute Resources: HPC Cluster, Individual Server (Batch Interface, Model Wrappers)
• Embedded results panel: plots of Proportion Symptomatic (0.00 to 0.08) versus Day (0 to 120) for 15 counties, Autauga through Cleburne
Typical Visualization Workflow
[Figure: workflow diagram linking the User, the CINET Server, and CINETViz]
• CINETViz components
– GEXF Generation Core
– Layout Core
– Network Analysis Core
– Visualization Core
– Web Rendering Script
• Workflow steps
– User Parameters
– Generate GEXF from CINET graphs
– Apply layout
– Network analysis
– Color, size, label
– Process data for web browser
– Store rendered graph
– Display network in web browser
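The GEXF generation step in the workflow can be pictured with the minimal sketch below, which writes a small undirected edge list as a GEXF 1.2 document using only the Python standard library. How CINET actually stores its graphs is not stated in these slides, so the edge-list input and the function name `edges_to_gexf` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def edges_to_gexf(edges):
    """Build a minimal GEXF 1.2 document (as a string) from (source, target) pairs."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(
        root, "graph", {"mode": "static", "defaultedgetype": "undirected"}
    )
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")

    # Collect node ids in first-seen order, without duplicates.
    node_ids = dict.fromkeys(n for edge in edges for n in edge)
    for n in node_ids:
        ET.SubElement(nodes_el, "node", {"id": str(n), "label": str(n)})
    for i, (u, v) in enumerate(edges):
        ET.SubElement(
            edges_el, "edge", {"id": str(i), "source": str(u), "target": str(v)}
        )
    return ET.tostring(root, encoding="unicode")

# Hypothetical input: a three-node path.
gexf_text = edges_to_gexf([(1, 2), (2, 3)])
```

A file written this way can be opened in the Gephi desktop application or imported through the Gephi Toolkit for the layout and analysis steps.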
CINETViz – Features
• Mimics the core functionality of the Gephi desktop application in a web interface:
– Layout
– Ranking based on parameters
– Partitioning
• Dynamic range of visualization
– Users can choose how node color and size vary, and by how much
• Stores rendered networks in an organized structure
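One simple way to realize the user-chosen range for node size is a linear rescaling of a node measure onto a minimum and maximum size. The sketch below shows that idea in plain Python. The slides do not say which mapping CINETViz uses, so the linear mapping, the function name `scale_sizes`, and the sample values are all assumptions.

```python
def scale_sizes(values, min_size, max_size):
    """Linearly map each node's measure onto the range [min_size, max_size]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All nodes share the same value: give them the midpoint size.
        mid = (min_size + max_size) / 2
        return {node: mid for node in values}
    span = hi - lo
    return {
        node: min_size + (v - lo) / span * (max_size - min_size)
        for node, v in values.items()
    }

# Hypothetical degree values and a user-chosen size range of 10 to 30.
node_degree = {"a": 1, "b": 3, "c": 5}
sizes = scale_sizes(node_degree, 10.0, 30.0)
```

The same function could drive a color gradient by treating the output as a position between two colors instead of a size.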
CINETViz-DEMO
• Main Screen
• http://128.173.98.199:8082/granite
CINETViz-DEMO
• Visualization is integrated as a tab in the CINET interface
CINETViz-DEMO
• The user can view a pre-rendered network or submit a new network visualization.
CINETViz-DEMO
• To generate a new network visualization, the user picks a network and selects the appropriate visualization parameters
Difficulties
• Graph format
– Diverse
– Conversion
• Data transfer from server to web app
– Latency, bandwidth, browser compatibility and support
• Integration with CINET
– Compatibility with existing architecture
– Issues with SmartGWT, etc.
CINETViz Implementation Challenges
• Study of CINET GRANITE framework
• Integration of visualization toolkit into web browser
– Communication between GWT and the sigma.js visualization library using native JavaScript
• Communication between web server and high-performance cluster
• Implementation of visualization methods (coloring, sizing, layout) programmatically using the Gephi Toolkit
Visualizing Large Networks
• A network is considered large if |V| ≥ 10,000 or |E| ≥ 50,000
• Choose a root node
– Randomly
– User defined
• Using BFS, explore from the root up to:
– Pre-specified depth (e.g., 4 or 5)
– Pre-specified number of nodes (e.g., 200 nodes)
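The sampling idea above can be sketched as a breadth-first search that stops at a depth limit or a node budget. This is a minimal illustration in plain Python, not the project's code. The function names are hypothetical, and applying both limits together (rather than one or the other) is an assumption.

```python
from collections import deque

def is_large(num_nodes, num_edges):
    """Size threshold from the slide: |V| >= 10,000 or |E| >= 50,000."""
    return num_nodes >= 10_000 or num_edges >= 50_000

def bfs_sample(adj, root, max_depth=4, max_nodes=200):
    """Return the nodes reached by BFS from `root`, stopping at either limit."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if depth[u] == max_depth:
            continue  # do not expand beyond the depth limit
        for w in adj.get(u, ()):
            if w not in depth:
                if len(depth) >= max_nodes:
                    return set(depth)  # node budget exhausted
                depth[w] = depth[u] + 1
                queue.append(w)
    return set(depth)

# Hypothetical toy graph: the path 0 - 1 - ... - 9.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
by_depth = bfs_sample(path, 0, max_depth=2)
by_count = bfs_sample(path, 0, max_depth=10, max_nodes=4)
```

The induced subgraph on the returned node set is what would be laid out and rendered, so the browser never receives the full large network.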
Future Work
• Workflow
– Visualizing the output
• Providing more information
– Showing node labels, IDs, edge weights, etc.
• Filtering
– Visualize a small part of the graph
• Graph organization by applying multiple algorithms
– For example, applying both PageRank and betweenness centrality
• Comparison of different visualizations
– Using different measures
Questions and Comments