Visual Hints for Semantic Graph Exploration
TEL-AVIV UNIVERSITY RAYMOND AND BEVERLY SACKLER FACULTY OF EXACT SCIENCES SCHOOL OF COMPUTER SCIENCE Visual Hints for Semantic Graph Exploration Thesis submitted in partial fulfillment of the requirements for the M.Sc. degree of Tel-Aviv University by Ala Stolpnik The research work for this thesis has been carried out at Tel-Aviv University under the supervision of Dr. Ariel Shamir and Prof. Daniel Cohen-Or October 2009

Abstract

Semantic graphs contain typed nodes and typed links between nodes. They impose greater demands on a visualization system as they can represent more than one graph and contain a great amount of information as attributes of graph entities. Semantic graphs need stronger tools that combine statistical and topological analysis and provide the link to the correct information context whenever possible. In this work we present visual hints as a method to reveal semantic information and assist both navigation and exploration of semantic graphs. Visual hints are defined by specific queries on the elements of the graph or their data. We define three types of visual hints (topological, statistical and contextual) and show how these are used effectively in an interactive graph visualization system for various tasks.

Contents

INTRODUCTION
    Definitions
PREVIOUS WORK
SEMANTIC GRAPH VISUALIZATION
    Interactive Visualization Enhancements
        Aggregation
            Entities Aggregation
            Relations Aggregation
            Hybrid Aggregation
        Navigation
        Filtering
            Semantic Relevance
            Semantic Distance
            Example
                Naïve Exploration
                Semantic Exploration
        Depiction
            Layout
            Rendering
VISUAL HINTS
    Topological Hints
        Peek-in Hints
        External Links Hint
        Direct-links Hint
        Connectivity Hint
        Semantic Filtering
    Statistical Hints
        Attribute Hint
        Distribution Hint
        Distance Hint
        Node Degree Hint
        Semantic Filtering
    Contextual Hints
        Time-line Hint
        Positional Hint
IMPLEMENTATION
    Data Sources
    Case Studies
    User Experiences
CONCLUSIONS

List of Figures

Figure 1: Semantic graph without aggregation
Figure 2: Hierarchical entities grouping example
Figure 3: Different edge aggregation modes. From left to right: (a) No aggregation, (b) All edges aggregated, (c) Directional aggregation, (d) Typed aggregation
Figure 4: Transitive filter
Figure 5: Zoom in example
Figure 6: Zoom out example
Figure 7: Simple filter
Figure 8: Naive exploration - step 0
Figure 9: Naive exploration - step 1
Figure 10: Naive exploration - step 2
Figure 11: Naive exploration - step 0
Figure 10: Naive exploration - step 1
Figure 11: Naive exploration - step 2
Figure 12: Naive exploration - step 3
Figure 13: Node Type Cluster Layout
Figure 14: Edge type aware layout
Figure 15: Displaying entities grouped by type
Figure 16: Aggregated view. The Architectural_style, Religion and Location groups are displayed using the aggregate representation.
Figure 17: Minimal representation
Figure 18: Compact representation
Figure 19: Normal representation
Figure 20: Detailed representation
Figure 21: The color of the edge indicates the type of the edge
Figure 22: The width of the edges is proportional to the number of the relations they represent
Figure 23: The label of the edge can indicate the type(s) of the relations represented by the edge
Figure 24: An aggregation operation (a), and various topological hints: (b) Peek-in Hint, (c) External-link Hint, (d) Direct-link
Figure 25: Peeking into the Music Genres node reveals the contained genres in more detail.
Figure 26: Peeking into an edge connecting “Bauhaus” and “Person” reveals all architects associated with this style
Figure 27: Peek-in hint filtered by user selection
Figure 28: Connectivity hint between two nodes reveals the real path between them.
Figure 29: Entity types distribution hint
Figure 30: Link types distribution hint
Figure 31: Year attribute distribution. The different works aggregated in the "Work" node are distributed according to their year attribute.
Figure 32: Graph distance hint
Figure 33: Graph distance vs. Semantic distance.
Figure 34: Distance by type hint. In this example it is possible to see that, for example, most of the entities at distance 4 from Madonna are of the type "Concept".
Figure 35: Node Degree hint.
Figure 36: Filtering of Node Degree hint. The user is able to decide which entity types should be included in the statistical hint.
Figure 37: Timeline hint showing the distribution of structures on the time line
Figure 38: Positional hint for structures designed by Gaudi
Figure 39: Architects case study
Figure 40: Musicians case study

Chapter 1 Introduction

Graph visualization allows domain experts as well as novices to interactively navigate and explore graphs for various analysis tasks such as identifying interesting nodes, discovering patterns of links, detecting relationships, and identifying subgroups [Bor09a, Sof09]. In addition, many analysis tools rely on statistical methods measuring distances, degrees and links in the graph [BW03, dNMB05, Bor09b].

A semantic graph (also known as a relational data graph or heterogeneous graph) contains typed nodes and typed edges between nodes. Each element (node or edge) may have a set of attributes associated with it, and links of various types can connect various nodes. The power and expressiveness of these graphs lie not only in their structure but also in the semantic information that they carry. Semantic graphs impose greater demands on visualization and analysis systems. In fact, a semantic graph can represent more than one graph, depending on the types of nodes chosen and the types of links used to connect them, of which there are often many.

A large amount of research in graph visualization has been devoted to revealing important structures and allowing interactive navigation [HMM00, PS06a]. However, visually examining just the structure of a semantic graph is often not enough. In many studies the interrelation between structural properties of the graph and the attributes of the entities is crucial. The information present in the data as attributes of graph entities must be revealed and presented visually to enable true understanding and insight. Therefore, the complexity of semantic graphs calls for stronger tools that combine statistical and topological analysis and provide the links to the correct information context when possible. Towards this end we propose to use visual hints that can reveal various semantic layers of such graphs in a direct manner during graph exploration.
Mapping of values to visual attributes to depict semantic information has been used for a long time. For instance, the use of color, length and size of the visual elements is abundant. However, these attributes usually allow only a simple mapping of scalar values to visual depictions (e.g. larger nodes represent larger sub-graphs, color represents type). Our visual hints aim at depicting more complex semantic information including non-scalar attributes, more complex statistics, and temporal and spatial relations.

Moreover, due to the limitations of the human cognitive system, any graph visualization containing more than a few hundred nodes and edges may result in an incomprehensible depiction with many occlusions and overlaps [HMM00, FT09]. This makes differentiation and closer investigation of individual nodes and links almost impossible. Common techniques that address this problem are aggregation and filtering. We use aggregation to mean any form of clustering or merging of several graph elements in a hierarchical manner, hence reducing the number of displayed elements. Filtering also reduces the displayed elements of the graph but does so by totally removing parts of the graph from view. Both these techniques reduce the clutter but also hide much information from the user. In semantic graphs this problem becomes even more severe since the network contains many different types of nodes and links and the use of filtering and aggregation becomes a necessity (see Figure 1). By using visual hints the user can regain access not only to semantic information but also to hidden or aggregated parts of the graph, which assists intelligent decision making and navigation.

Figure 1: Semantic graph without aggregation

Visual hints can be invoked on nodes or edges of the current graph depiction. They are defined by specific queries on the elements of the graph or their data. We define three types of visual hints: topological, statistical and contextual, whose results are sub-graphs, charts, and contextual plots (e.g. a map or time-line) respectively. The key idea is that a hint displays the result of these queries in a succinct manner on top of the main graph view, and therefore does not disrupt the main view and the user’s mental map. However, hints are invoked directly on graph elements and are therefore displayed in context (e.g. next to the query element), providing direct and natural utilization during navigation. Several hints can be displayed at once, allowing more complex decision making based on comparisons. We present several examples of defining such visual hints and show how these can assist navigation and exploration of heterogeneous data graphs.

Definitions

• Entity – a data element that has attributes and can be related to other entities. For example: a person or an organization. Each entity usually has a type attribute.

• Relation – a connection between a pair of entities. For example, person A is "brother of" person B.

• Graph – an abstract representation of a set of objects where some pairs of the objects are connected by links. The objects are called “Nodes” and the links are called “Edges”.

• Node – an element in the graph. A node can represent an entity or a group of entities.

• Edge – an element in the graph. An edge can represent a relation or a group of relations between the entities represented by the nodes it connects.

• Semantic Graph – a graph that contains nodes and edges of different types. It is created from a data set containing entities and relations of different types.
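The definitions above can be read as a small data model. The following is a minimal sketch under that reading and is not taken from the thesis implementation; the class names, field names and sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A data element with a type attribute (assumed field names)."""
    id: str
    type: str


@dataclass(frozen=True)
class Relation:
    """A typed, directed connection between a pair of entities."""
    source: str
    target: str
    type: str


@dataclass
class Node:
    """A graph element that represents one entity or a group of entities."""
    label: str
    entities: list = field(default_factory=list)


@dataclass
class Edge:
    """A graph element that represents one or more relations between two nodes."""
    source: Node
    target: Node
    relations: list = field(default_factory=list)


# Illustrative data only: person A is "brother of" person B.
a = Entity("A", "Person")
b = Entity("B", "Person")
node_a = Node("A", [a])
node_b = Node("B", [b])
edge = Edge(node_a, node_b, [Relation("A", "B", "brother of")])
people = Node("People", [a, b])  # a single node standing for a group of entities
```

Separating entities and relations (the data) from nodes and edges (what is drawn) mirrors the distinction made in the definitions, where one node or edge may stand for a whole group.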

Chapter 2 Previous Work

Semantic graphs are becoming a standard to describe complex data. The node and link types are often related through an ontology graph, also known as a schema. A standard way to represent semantic graphs is the RDF (Resource Description Framework) format [RDF03, FTjH05]. In the RDF representation each node in the graph is an object represented by a uniform resource identifier. Each node can be connected to other nodes by edges or to literals, which represent attributes.

Large data sets today are usually arranged hierarchically and the user has the ability to navigate the hierarchy and decide the desired level of detail. Many works have addressed the problem of hierarchical clustering and visualization of hierarchical graphs [AK02, EH00, BD07, PS06a]. The power of semantic graphs is that the clustering may be performed automatically, based on the schema of the data [THP08, FTjH05]; therefore, entities that are aggregated together often share similar semantics, for example belonging to the same type. Visually revealing the semantics assists in understanding the displayed information and allows better exploration of the data.

Visualizing the data as a hierarchical graph is useful when the user's task is to understand the global structure of the data. However, when the user is interested in the detailed data, such as the attributes of specific entities and the relationships between them, the overall representation of the data is not adequate [DKS07]. In those cases it is better to begin from a point of interest and then incrementally explore more of the graph. During the exploration the user is required to decide where the exploration should continue. Without adequate visual hints this task is difficult, and decisions may cause unnecessary clutter and confusion.

When a user is investigating a large aggregated data set, it is often beneficial to switch between the summarized overview and the detailed information. The user is obliged to mentally combine the provided overview and the detailed views, and this can cause her to lose context. Previous approaches that deal with this problem use Fisheye zoom and lenses. Fisheye views [Fur86] were originally developed as a way of balancing local detail and global context in the interface, based on how humans conceptually structure and manage large collections of information. Lenses [TAvHS06] enable an instant zoom into a small portion of the graph in order to see the detailed sub-graph while still preserving the context. We use similar ideas of providing more details locally in various topological hints for hierarchically clustered graphs.

Interactive visualization of a graph contains three parts: navigation, filtering and depiction of the currently visible data. Each of these parts may be enhanced by using the semantics of the data. A semantic graph contains entities and relations of different types. Some types are more relevant than others for the task performed by the analyst. There are methods [BCE05] to evaluate the relevance of individual links and nodes in the semantic graph for detecting relationships. Navigation and filtering of the semantic graph may be enhanced using this information.

Many works have addressed visual cues for depiction. Some use color, shape and size to distinguish between nodes and edges with different attributes [FTjH05, PS06a]. Others aggregate and lay out the nodes according to similar semantics [SF08, HHG09, CPC09]. More sophisticated approaches calculate the relevancy of each data item and emphasize it visually [Shn96, JP02, JP05]. However, most of those methods only reveal scalar information while we target depictions of more complex information.

Filtering according to the semantics of the data was also addressed in previous work. Nodes and edges are filtered according to their types [PS06a, SMER06] or some statistical metric [PS08]. Visual distinction between nodes and edges with different characteristics serves as a visual hint for filtering. Recently other works have promoted integrating statistics with interactive graph visualization [PS06a, PS08]. Our statistical hints follow similar lines and offer graph statistics as well as semantic data statistics. We also provide a more focused analysis to assist local navigation decisions and define other hints that tightly integrate the correct semantic context (maps, timelines). Visual hints are also related to scented widgets [WHA07], which are used as navigation hints for low-level user-interface widgets such as lists, sliders and checkboxes. However, we depict more complex queries and concentrate on high-level interactive graph navigation.

Chapter 3 Semantic Graph Visualization

Interactive visualization of a semantic graph contains three parts: navigation, filtering and depiction of the currently visible data. For completeness, and to allow better understanding of the visual hints we present, we first describe the framework for semantic graph visualization, enhanced by the use of semantics. Towards this end we propose to use visual hints that can reveal various semantic layers of such graphs to assist in the exploration.

Interactive Visualization Enhancements

Aggregation

When the data is large and it is not possible to display all the entities on the screen at once, the data must be hierarchically grouped to enable an overview and navigation.

Entities Aggregation

An entity hierarchy can be defined based on the semantics of the data. The hierarchy can be defined automatically, according to the schema describing the data, or with the help of the user. We propose here a few methods for grouping entities:

• Entities that have the same type. Example: all the people belong to one group and all the organizations to another.

• All the entities that are related by the same relation type to an entity selected by the user. Example: suppose the user selected an entity called Moshe. All the brothers of Moshe belong to the group called "Brothers of Moshe". All the colleagues of Moshe belong to the group called "Colleagues of Moshe".

• The user can define a property according to which the entities should be grouped. Example: all the people that live in the same country will belong to the same group.

• The user can select specific entities from the data and decide that they belong to the same group.

The final grouping structure can be any combination of the above methods. The only constraint is that it must be hierarchical. Example:

Figure 2: Hierarchical entities grouping example
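The grouping methods listed above can be combined into a hierarchy by applying them one after another. The sketch below illustrates this idea only; the sample entities, the attribute names (`type`, `country`) and both helper functions are assumptions made for the example and are not part of the described system.

```python
from collections import defaultdict

# Hypothetical sample entities; the names echo the example in the text.
entities = [
    {"id": "Moshe", "type": "Person", "country": "Israel"},
    {"id": "David", "type": "Person", "country": "Israel"},
    {"id": "John", "type": "Person", "country": "US"},
    {"id": "Organization 1", "type": "Organization", "country": "US"},
]


def group_by(items, key):
    """Partition items into groups that share the same value of `key`."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item["id"])
    return dict(groups)


def group_hierarchically(items, keys):
    """Apply the grouping keys one after another, producing a nested hierarchy."""
    if not keys:
        return [item["id"] for item in items]
    buckets = defaultdict(list)
    for item in items:
        buckets[item[keys[0]]].append(item)
    return {value: group_hierarchically(members, keys[1:])
            for value, members in buckets.items()}


by_type = group_by(entities, "type")
hierarchy = group_hierarchically(entities, ["type", "country"])
```

Because each level only subdivides the groups of the level above it, the result is hierarchical by construction, which is the one constraint stated in the text.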

Relations Aggregation

If there is more than one relation connecting a pair of nodes, for example if the nodes represent aggregated entities, there are a few ways to draw them. The aggregation of the relations can be defined according to the types of the relations:

• No Aggregation – Each relation is represented by a single edge and is drawn separately.

• All Aggregated – All the relations are aggregated into a single edge.

• Directional Aggregation – All the relations in the same direction are aggregated into a single edge.

• Typed Aggregation – All the relations of the same type and in the same direction are aggregated into a single edge.

Figure 3: Different edge aggregation modes. From left to right: (a) No aggregation. (b) All edges aggregated. (c) Directional aggregation, (d) Typed aggregation
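The four aggregation modes differ only in the key by which relations are grouped into drawn edges. A minimal sketch of that idea follows; the relation triples and the mode names are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Relations between two aggregated nodes, as (source, target, type) triples.
relations = [
    ("People", "Organizations", "works in"),
    ("People", "Organizations", "works in"),
    ("People", "Organizations", "founded"),
    ("Organizations", "People", "employs"),
]


def aggregate(relations, mode):
    """Group relations into drawn edges according to the aggregation mode."""
    edges = defaultdict(list)
    for index, (source, target, rel_type) in enumerate(relations):
        if mode == "none":
            key = index                        # every relation is its own edge
        elif mode == "all":
            key = frozenset((source, target))  # direction and type are ignored
        elif mode == "directional":
            key = (source, target)
        elif mode == "typed":
            key = (source, target, rel_type)
        else:
            raise ValueError(f"unknown mode: {mode}")
        edges[key].append((source, target, rel_type))
    return dict(edges)


edge_counts = {mode: len(aggregate(relations, mode))
               for mode in ("none", "all", "directional", "typed")}
```

For the sample data the four modes yield 4, 1, 2 and 3 drawn edges respectively, and the length of each group gives the number of relations that an aggregated edge represents.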

Hybrid Aggregation

A more generic case of aggregation is when a single graph element (node or edge) aggregates a sub-graph containing both entities and relations. Transitive Filtering is an example of such aggregation. During Transitive Filtering the filtered nodes are removed from the screen, but all the links that are connected to them are translated to direct links. For example: Person A works in Company B, which is located in Country C. Therefore the graph contains the links A->B and B->C. When the Company B node is transitively filtered, the two links are translated to a single link A->C. It is possible to define, using the semantics of the data, which nodes should be filtered out transitively. Transitive filtering enables decreasing the number of displayed nodes without losing the connectivity information. Although node B is not displayed after the filtering, the connectivity information that it carries is still visible.

[Figure 2 diagram labels: All entities; People, Organizations, Dates; Moshe, Moshe Brothers, Moshe Colleagues; Colleagues in Israel, Colleagues in US; Organization 1, Organization 2; Moshe Birthday]

Figure 4: Transitive filter
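The A->B, B->C example can be sketched in a few lines. This is one possible reading of transitive filtering, in which every directed path through a filtered node becomes a direct link; the function name and the decision to drop self-loops are assumptions.

```python
def transitive_filter(edges, hidden):
    """Remove `hidden` nodes and reconnect their neighbours with direct links.

    `edges` is a set of directed (source, target) pairs.
    """
    edges = set(edges)
    for node in hidden:
        incoming = {s for (s, t) in edges if t == node and s != node}
        outgoing = {t for (s, t) in edges if s == node and t != node}
        # Drop every link that touches the filtered node.
        edges = {(s, t) for (s, t) in edges if node not in (s, t)}
        # Translate each path through the node into a direct link
        # (self-loops are skipped, which is an assumption of this sketch).
        edges |= {(s, t) for s in incoming for t in outgoing if s != t}
    return edges


graph = {("A", "B"), ("B", "C")}
filtered = transitive_filter(graph, hidden=["B"])
```

Here `filtered` contains only the link A->C, so node B disappears from the view while the connection between A and C remains visible.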

Navigation

Once the graph is hierarchically grouped, the user should be able to navigate the graph to select the desired resolution.

Zoom in – zooming into a node replaces the selected node with the nodes it contains.

Figure 5: Zoom in example

Zoom out – replaces all the nodes under the selected node with the selected node.

Figure 6: Zoom out example
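Both navigation operations can be expressed as edits to the list of currently visible nodes. The sketch below assumes the hierarchy is stored as a mapping from each group to the nodes it contains; the hierarchy contents and function names are hypothetical.

```python
# Hypothetical hierarchy: each group maps to the nodes it contains.
children = {
    "All entities": ["People", "Organizations"],
    "People": ["Moshe", "Moshe Brothers"],
    "Organizations": ["Organization 1", "Organization 2"],
}


def descendants(node):
    """All nodes below `node` in the hierarchy."""
    found = []
    for child in children.get(node, []):
        found.append(child)
        found.extend(descendants(child))
    return found


def zoom_in(visible, node):
    """Replace `node` with the nodes it contains."""
    if node not in visible or node not in children:
        return list(visible)
    position = visible.index(node)
    return visible[:position] + children[node] + visible[position + 1:]


def zoom_out(visible, node):
    """Replace every visible node under `node` with `node` itself."""
    below = set(descendants(node))
    kept = [n for n in visible if n not in below]
    return kept if node in kept else kept + [node]


view = zoom_in(["All entities"], "All entities")
detailed = zoom_in(view, "People")
collapsed = zoom_out(detailed, "People")
```

Zooming out on "People" removes every visible node beneath it, at any depth, and shows the single group node instead.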

Filtering

When the data is too large to be displayed on the screen at once, filtering is a well-known technique for clutter reduction. Filtering is the process of deciding which entities and relations are displayed and which are hidden. Filtering out irrelevant nodes and edges helps the analyst to concentrate on the important parts of the graph. In semantic graphs filtering may be applied according to node or edge attributes [PS06a, SMER06].

Figure 7: Simple filter

At every moment the user views a visualization of a subset of the whole data; therefore, methods for exploring the hidden information are required. Hidden entities may be found either by searching for them in the data or by revealing the entities connected to the already displayed ones. There might be many neighbor nodes, so using the semantics may reduce clutter and help the user see only the relevant information. Selection of the entities to reveal may depend on the following parameters:

• Node types – For example, if the selected node is a person, the user might want to see the other people and organizations that are related to the selected person, but not the countries he visited.

• Relation types – For example, at a specific moment the user might be interested in family connections but not in business connections between the selected person and other people.

• Semantic Weights – "Semantic Weights" enable the analyst to explore the data taking the semantics into account, which makes it possible to focus on the data that is relevant to the investigated topic.

Semantic weights are defined as a variation of the Fisheye model, by defining a Degree of Interest (DOI) function that depends on the semantics of the explored data. The DOI assigns to each point in the structure a number telling how interested the user is in seeing that point, given the current task. Generalized fisheye views arise by decomposing the DOI into two components: a priori importance and distance. In its simplest, additive form the generalized fisheye Degree of Interest function is:

fisheyeDOI(x | . = y) = API(x) - D(x, y)

where fisheyeDOI is, according to the fisheye model, the user's Degree of Interest in a point x, given that the current point of focus is y; API(x) is the global A Priori Importance of x; and D(x, y) is the Distance between x and the current point y. That is, the interest increases with a priori importance and decreases with distance. We have to define the two components of the DOI function:

• API(x) is the global A Priori Importance of x, where x is an entity in the data. This is defined as "Semantic Relevance" of the entity x to the user's domain of interest.

• D(x,y) is the distance between an entity x and the currently selected node y. This is defined as "Semantic Distance" of the entity x to the node y.

Semantic Relevance

"Semantic Relevance" is the relevance of the entity x to the user's domain of interest. Semantic Relevance can depend on:

• Type of the entity. Example: if the area of interest is terror, entities of type "Plane" are more relevant than entities of type "Electronic Device".

• Properties of the node. Example: if the area of interest is terror and the node represents a person from Iran, it is more interesting than a node that represents a person from Norway.

Semantic relevance of each node type can be either manually defined by the user or can be automatically calculated using methods from [BCE05].

Semantic Distance

Semantic distance is a distance between an entity and the selected node. Semantic distance depends on:

• Number of paths connecting the entity to the selected node

• Length of paths connecting the entity to the selected node

• Link types on the paths. Semantic relevance of each link type can be either manually defined by the user or can be automatically calculated using methods from [BCE05].
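The additive DOI above can be illustrated with a minimal sketch. It assumes, for illustration only, that the semantic distance is the cost of the cheapest path when each link type carries a cost, which covers path length and link types but ignores the number of paths. The weight tables and the node and link names are hypothetical, not values from the system.

```python
import heapq

# Hypothetical weights; in the text these are user-defined or computed as in [BCE05].
NODE_RELEVANCE = {"Person": 1.0, "Organization": 0.8, "Country": 0.2}
LINK_COST = {"family": 1.0, "member_of": 1.5, "visited": 4.0}

def semantic_distance(edges, source):
    """Cheapest-path cost from source; each link costs LINK_COST[link_type]."""
    adj = {}
    for u, w, link_type in edges:
        adj.setdefault(u, []).append((w, LINK_COST[link_type]))
        adj.setdefault(w, []).append((u, LINK_COST[link_type]))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for w, cost in adj.get(u, []):
            if d + cost < dist.get(w, float("inf")):
                dist[w] = d + cost
                heapq.heappush(heap, (d + cost, w))
    return dist

def fisheye_doi(node_types, edges, focus):
    """DOI(x | . = y) = API(x) - D(x, y) for every node reachable from the focus y."""
    dist = semantic_distance(edges, focus)
    return {x: NODE_RELEVANCE[node_types[x]] - d for x, d in dist.items()}

node_types = {"alice": "Person", "bob": "Person", "acme": "Organization", "peru": "Country"}
edges = [("alice", "bob", "family"), ("alice", "acme", "member_of"),
         ("alice", "peru", "visited")]
doi = fisheye_doi(node_types, edges, "alice")
# Reveal only the entities whose degree of interest passes a chosen threshold.
revealed = [x for x, score in doi.items() if score >= -1.0]
```

Lowering the threshold step by step corresponds to increasing the displayed semantic distance during exploration.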

Example

The task is to investigate the relations of Yasser Arafat with Bashar Al-Assad. The data available is MindSwap [MIN05], a Semantic Web Terrorism Knowledge Base. Exploring the data naively until the relationships are revealed leads to the following scenario:

Naïve Exploration: Step 0 – only the entities in the core of the interest are displayed:

Figure 8: Naive exploration - step 0

Step 1 – Reveal the direct relations of the displayed data. 16 new nodes are displayed, but no connectivity is discovered.

Figure 9: Naive exploration - step 1

Step 2 – Reveal the direct relationships of the newly added nodes. The connection between Arafat and Bashar is visualized, but the clutter makes it very difficult to see, and also makes it difficult to understand the context of the nodes on the path.

Figure 10: Naive exploration - step 2

Semantic Exploration: Step 0 – only the entities in the core of the interest are displayed:

Figure 11: Semantic exploration - step 0

Step 1 – the semantic distance to be displayed is increased:

Figure 12: Semantic exploration - step 1

Step 2:

Figure 13: Semantic exploration - step 2

Step 3 – the connection between Bashar and Arafat is revealed:

Figure 14: Semantic exploration - step 3

Depiction

At every visualization stage the current graph is displayed to the user. The depiction of the graph can also be enhanced by using the semantics hidden in the data.

Layout

Layout is the process of placing the nodes on the screen. A layout that is aware of node types and edge types helps the user understand the semantics of the data. [SF08] proposed a method for considering node properties by using Magnets. The method we propose is more flexible. For this purpose new layouts are presented:

Node Type Clustered Layout – all the nodes of the same type are grouped into a single region. This layout enables the analyst to distinguish between different types of nodes. It also assists in the investigation of relations between entities of different types. [PS06a] also uses such node aggregation, but aggregates nodes according to SNA (Social Network Analysis) metrics like betweenness, without considering the semantics of the data.

Figure 15: Node Type Cluster Layout.

Edge Type Aware Layout – a force-directed layout in which the user is able to specify spring constants (spring length and strength) for each link type separately. This gives the user control over the layout: he determines which links move the connected nodes closer, which move them farther apart, and which do not influence the layout at all. Edge type aware layout enables visual clustering of the nodes according to their semantic information, emphasizing different aspects of the data.

Figure 16: Edge type aware layout.

Node Type Aware Layout – a force-directed layout in which the user is able to define attraction and repulsion forces per node type. This layout enables clustering of the nodes according to different characteristics of the data.
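As a rough illustration of the edge type aware layout, the sketch below applies one relaxation step of Hooke-style springs whose rest length and strength are looked up per link type. The constants, link types and positions are hypothetical, node repulsion is omitted, and this is not the Prefuse implementation used by the system.

```python
import math

# Hypothetical spring constants per link type: (rest length, strength).
# A strength of 0 means links of that type do not influence the layout.
SPRINGS = {"family": (50.0, 0.10), "business": (200.0, 0.02), "mentions": (100.0, 0.0)}

def spring_step(pos, edges, step=1.0):
    """One relaxation step using spring forces only (no node repulsion)."""
    force = {n: [0.0, 0.0] for n in pos}
    for u, w, link_type in edges:
        rest, k = SPRINGS[link_type]
        dx = pos[w][0] - pos[u][0]
        dy = pos[w][1] - pos[u][1]
        length = math.hypot(dx, dy) or 1e-9
        f = k * (length - rest)  # positive when stretched: pulls the ends together
        fx, fy = f * dx / length, f * dy / length
        force[u][0] += fx
        force[u][1] += fy
        force[w][0] -= fx
        force[w][1] -= fy
    return {n: (pos[n][0] + step * force[n][0], pos[n][1] + step * force[n][1])
            for n in pos}

pos = {"a": (0.0, 0.0), "b": (100.0, 0.0), "c": (0.0, 100.0)}
edges = [("a", "b", "family"), ("a", "c", "mentions")]
new_pos = spring_step(pos, edges)
```

In this example the "family" spring pulls a and b toward each other, while the zero-strength "mentions" link leaves c where it was.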

Rendering

Rendering is the process of drawing the graph nodes and edges on the screen. Rendering determines how each node and each edge is displayed. This section describes how data semantics can enhance the rendering of a Semantic Graph. Each entity and relation has a type and may have attributes. The entities are displayed as nodes and the relations as edges in the Semantic Graph. The type and the attributes of an entity can affect its visual representation and therefore assist in understanding the data.

Group node – When the graph is hierarchically clustered, some nodes represent groups of entities. All the relations of the entities in the group are aggregated into relations of the group. The size of the node indicates the number of entities the group contains. This compact presentation makes it possible to examine the global structure of the graph.

Figure 17: Displaying entities grouped by type

Aggregate entities representation – A group of entities is not necessarily represented by a single node. In the aggregate representation a group is represented by a region on the screen and all its entities as small dots within the region. This representation makes it possible to examine the relations of specific entities within the group without causing too much clutter.

Figure 18: Aggregated view. The Architectural_style, Religion and Location groups are displayed using the aggregate representation.

Minimal nodes representation – The entity is displayed as a small point. The color of the node indicates the type of the entity. A tooltip on a node shows the name of the entity it represents.

Figure 19: Minimal representation

Compact nodes representation – Only the icon representing the type of the entity is displayed. A tooltip on a node shows the name of the entity it represents.

Figure 20: Compact representation

Normal nodes representation – The entity is displayed clearly with the label indicating its name and the icon representing its type.

Figure 21: Normal representation

Detailed nodes representation – The node displays the label, the icon and the properties of the entity. The full representation depends on the type of the entity. For example, if the entity is a person, his personal details and a picture can be displayed.

Figure 22: Detailed representation

Edge color – The color of the edge indicates the type of the edge.

Figure 23: The color of the edge indicates the type of the edge

Edge width – If the edge is aggregated, its width is proportional to the number of sub-edges it aggregates.

Figure 24: The width of the edges is proportional to the number of the relations they represent

Edge label – The label of the edge can indicate the type(s) of the relations represented by the edge.

Figure 25: The label of the edge can indicate the type(s) of the relations represented by the edge

Visual Hints

Visual hints assist in the navigation and exploration of semantic graphs.

Topological Hints

Topological hints are useful not only on semantic graphs but also on simple graphs that contain no typed nodes or edges. Hence, for simplicity we define topological hints on a simple undirected graph $G_{origin} = \{V_{origin}, E_{origin}\}$. An extension to directed graphs is straightforward, and we extend the hints to semantic graphs at the end of the section. In most cases a graph is too large to be displayed directly, and therefore visualization systems use some aggregation and filtering operations to arrive at a summarized graph $G = \{V, E\}$. The summarized graph display is usually defined by a cut in a tree representing the aggregation operations along with some filtering of graph parts [HMM00].

Figure 26: An aggregation operation (a), and various topological hints: (b) Peek-in Hint, (c) External-link Hint, (d) Direct-link

We assume that the summarized graph G is displayed in the current view. Observing G, the user may sometimes want to inspect a specific node or edge, or drill down the hierarchy, without modifying the whole graph layout and without losing the current visual context. Topological hints provide the means to look into and examine the relationships between nodes and graph sub-structures while still preserving the global structure and thus the user’s mental map. We define a topological hint as a sub-graph which is the result of a specific topological query on an element (node or edge), or group of elements, of the current graph.

We concentrate, without loss of generality, on one aggregation operation and denote by $G^+ = \{V^+, E^+\}$ the more detailed graph before applying the operation and arriving at G. For aggregation of nodes we have that $\forall v \in V$, $\exists S \subset V^+$ such that $V = (V^+ \setminus S) \cup \{v\}$. We say that v is the representative of S in G (see example in Figure 26(a)). Note that S could also be $\{v\}$, in which case no aggregation operation was applied. All edges connected to nodes in S must also be aggregated as follows. Define

$E_S = \{(u,w) \in E^+ \mid u, w \in S\}$.

$E_S$ represents all edges that are removed totally from the graph display when S is replaced by v (edges (h,i) and (j,k) in Figure 26(a)). Define

$E_{\Delta(S)} = \{(u,w) \in E^+ \mid u \in S, w \in (V^+ \setminus S)\}$.

$E_{\Delta(S)}$ represents all edges that link nodes in S to their neighbors and are replaced by new edges in G (e.g. edges (g,a) and (j,d) in Figure 26). The new edges connect the new representative node v to the neighbors (all $w \in (V^+ \setminus S)$) and are represented by

$E_v = \{(v,w) \mid (u,w) \in E_{\Delta(S)}, u \in S\}$

(e.g. edges (v,a) and (v,d)). We can now write the relation between all edge groups as follows:

$E = (E^+ \setminus (E_S \cup E_{\Delta(S)})) \cup E_v$.

Note that the same edge can appear several times during the process of aggregation. For instance, if we have $e_1 = (v_1, u_1)$ and $e_2 = (v_2, u_2)$, and then we cluster $v_1$ and $v_2$ to v, and $u_1$ and $u_2$ to u, then in the aggregated graph the same edge $(v, u)$ represents both $e_1$ and $e_2$. Hence, we can have composite edges that represent several links between different nodes (in a semantic graph these links can also be of different types).

We present several specific topological hints based on queries on elements in G. However, the vocabulary of topological hints could be extended easily by defining other topological queries.
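The aggregation operation and its edge groups can be sketched in a few lines. This is a minimal illustration under stated assumptions: edges are stored as (u, w) tuples of an undirected graph, the function name `aggregate` and the sample graph are hypothetical, and the counter of sub-edges is only one possible way to obtain the width of a composite edge.

```python
from collections import Counter

def aggregate(V_plus, E_plus, S, v):
    """Replace the node set S by a representative v and sort edges into groups."""
    S = set(S)
    E_S, E_delta, E_rest = set(), set(), set()
    for a, b in E_plus:
        if a in S and b in S:
            E_S.add((a, b))          # internal edge, hidden after aggregation
        elif a in S:
            E_delta.add((a, b))      # stored as (inside, outside)
        elif b in S:
            E_delta.add((b, a))
        else:
            E_rest.add((a, b))       # untouched by the operation
    E_v = {(v, w) for _, w in E_delta}
    V = (set(V_plus) - S) | {v}
    E = E_rest | E_v                 # E = (E+ \ (E_S u E_delta)) u E_v
    return V, E, E_S, E_delta, E_v

V_plus = {"a", "b", "g", "h", "i"}
E_plus = {("h", "i"), ("g", "a"), ("i", "a"), ("a", "b")}
V, E, E_S, E_delta, E_v = aggregate(V_plus, E_plus, {"g", "h", "i"}, "v")
# Number of original links behind each composite edge (v, w).
sub_edges = Counter(w for _, w in E_delta)
```

Here the two links (g,a) and (i,a) collapse into the single composite edge (v,a).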

Peek-in Hints

There are times when a quick glimpse at a small portion of an aggregated graph is useful. When observing the summarized graph the user may need to inspect a specific component without modifying the whole graph and without losing the context. This type of hint allows the user to “peek” into lower levels of the hierarchy in an aggregated graph display, very similar to the use of lenses in other works [Fur86, TAvHS06].

Assume a node $v \in V$ is the representative of $S \subset V^+$; then a peek-in hint on v displays the sub-graph $G_{Peek(v)} = \{S, E_S\}$ (see Figure 26(b), Figure 27). Similarly, when we choose a set $T = \{v_j\}$, where the $v_j \in V$ are representatives of $S_j \subset V^+$, a peek-in hint on T will be $G_{Peek(T)} = \{V_T, E_T\}$ such that $V_T = \bigcup_j S_j$ and $E_T = \{(x,y) \in E^+ \mid x, y \in V_T\}$. Hence, $G_{Peek(T)}$ will include all edges in each $G_{Peek(v_j)}$, but also edges linking the nodes across each $G_{Peek(v_j)}$.

Figure 27: Peek-into the Music Genres node reveals the contained genres in more detail.

Assume an edge $e \in E$ connects two nodes $v_1, v_2 \in V$ which are representatives of $S_1, S_2$ from $G^+$. A peek-in hint on e displays the sub-graph $G_{Peek(e)} = \{V_e, E_e\}$, where $E_e = \{(u,w) \in E^+ \mid u \in S_1, w \in S_2\}$ and $V_e = \{v \in V^+ \mid v \in E_e\}$, i.e. the endpoints of the edges in $E_e$ (see Figure 28).
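The three peek-in queries can be written directly from their set definitions. The following is a minimal sketch, assuming undirected edges stored as (u, w) tuples; the function names and the small genre/person data set are hypothetical.

```python
def peek_node(E_plus, S):
    """Peek-in on a node: the sub-graph {S, E_S}."""
    S = set(S)
    return S, {(u, w) for u, w in E_plus if u in S and w in S}

def peek_set(E_plus, groups):
    """Peek-in on several nodes: all edges inside the union of their groups."""
    V_T = set().union(*groups)
    return V_T, {(x, y) for x, y in E_plus if x in V_T and y in V_T}

def peek_edge(E_plus, S1, S2):
    """Peek-in on an edge: the links between S1 and S2 and their endpoints."""
    S1, S2 = set(S1), set(S2)
    E_e = {(u, w) for u, w in E_plus
           if (u in S1 and w in S2) or (u in S2 and w in S1)}
    V_e = {n for edge in E_e for n in edge}
    return V_e, E_e

E_plus = [("rock", "punk"), ("punk", "ska"), ("rock", "p1"), ("jazz", "p2")]
genres = {"rock", "punk", "ska", "jazz"}
people = {"p1", "p2", "p3"}

genre_nodes, genre_edges = peek_node(E_plus, genres)
link_nodes, link_edges = peek_edge(E_plus, genres, people)
all_nodes, all_edges = peek_set(E_plus, [genres, people])
```

Note that the edge peek omits p3, which has no link to any genre, whereas the set peek keeps every node of both groups.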

Figure 28: Peeking into an edge connecting “Bauhaus” and “Person” reveals all architects associated with this style

External Links Hint

The purpose of the external-links hint is to supply a snippet of the outside connections of a set $S \subset V^+$ represented by a node $v \in V$. Since all the edges in $E_{\Delta(S)}$ are aggregated and connected to v, it is not clear whether paths through v are indeed connected or not. For instance, in Figure 26(a) nodes f and b seem connected through v. An external-links hint on v supplies a quick overview of the nodes inside S that are linked to nodes outside S in the current graph by displaying the paths that pass through v. Let us denote all neighbors of S in $V^+$ as

$N(S) = \{w \in V^+ \mid \exists u \in S, (u,w) \in E_{\Delta(S)}\}$.

We also denote by $S_{ext}$ all nodes in S that have an external edge:

$S_{ext} = \{u \in S \mid \exists w \in N(S), (u,w) \in E_{\Delta(S)}\}$.

The simplest external-links hint on v, the representative of S, displays the graph containing only the nodes in S linked to the outside, along with their neighbors and links: $G_{ext(v)} = \{S_{ext} \cup N(S), E_{\Delta(S)}\}$ (Figure 26(c)). The user may select a subset $X \subset N(S)$ and filter the subgraph $G_{ext(v)}$ to contain only nodes and edges linked to nodes in X.

The following example (Figure 29) demonstrates an external-link hint on a subset of nodes selected by the user. Assume that the user is investigating relations between different pop artists. In this example it is possible to see from the aggregated graph that several different artists are related to some bands. However, it is impossible to understand whether some of the inspected artists belong to the same band or not. Expanding the “Bands” node would cause too much clutter (it contains 48 entities). In contrast, an external-link hint on the “Bands” node with selected artists clears the picture. Using this hint the user can easily discover, for example, that Mariah Carey, Kevin Federline and Akon all belong to the same band.
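A minimal sketch of the external-links query, including the optional filtering by a user-selected subset X, might look as follows. The function name and the artist/band identifiers are hypothetical placeholders, and edges are assumed to be undirected (u, w) tuples.

```python
def external_links_hint(E_plus, S, X=None):
    """Return (S_ext union N(S), E_delta), optionally restricted to neighbors in X."""
    S = set(S)
    E_delta = set()
    for a, b in E_plus:
        if a in S and b not in S:
            E_delta.add((a, b))      # stored as (inside, outside)
        elif b in S and a not in S:
            E_delta.add((b, a))
    if X is not None:
        E_delta = {(u, w) for u, w in E_delta if w in X}
    N_S = {w for _, w in E_delta}    # neighbors of S
    S_ext = {u for u, _ in E_delta}  # nodes of S with an external edge
    return S_ext | N_S, E_delta

E_plus = [("artist_a", "band_1"), ("artist_b", "band_1"),
          ("artist_c", "band_2"), ("band_3", "band_2")]
bands = {"band_1", "band_2", "band_3"}

nodes, links = external_links_hint(E_plus, bands)
sel_nodes, sel_links = external_links_hint(E_plus, bands, X={"artist_a", "artist_b"})
```

With the selection applied, only band_1 remains, which shows that the two selected artists share a band without expanding the whole group.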

Figure 29: External-links hint filtered by user selection

Direct-links Hint

A direct-link hint gives an even more refined view of the paths passing through the representative node v. A direct-link hint displays the sub-graph containing nodes in S that are linked to two or more nodes in N(S), hence providing direct links between those nodes. We define

$S_{ext2} = \{u \in S \mid \exists w_1, w_2 \in N(S), w_1 \neq w_2, (u,w_1), (u,w_2) \in E_{\Delta(S)}\}$ and

$E_{S2} = \{(u,w) \in E^+ \mid u \in S_{ext2}, w \in N(S)\}$;

then a direct-link hint displays the sub-graph $G_{Dir(v)} = \{S_{ext2}, E_{S2}\}$ (Figure 26(d)). Similar to external-link hints, N(S) could be replaced by a subset $X \subset N(S)$ and $G_{Dir(v)}$ filtered to contain only nodes and edges linked to nodes in X.
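The direct-links query differs from the external-links query only in requiring at least two distinct outside neighbors. A minimal sketch, with hypothetical names and data and undirected (u, w) edge tuples, is:

```python
def direct_links_hint(E_plus, S, X=None):
    """Return (S_ext2, E_S2): nodes of S with two or more outside neighbors."""
    S = set(S)
    outside = {}  # node in S -> set of its neighbors outside S
    for a, b in E_plus:
        if a in S and b not in S:
            outside.setdefault(a, set()).add(b)
        elif b in S and a not in S:
            outside.setdefault(b, set()).add(a)
    if X is not None:
        outside = {u: ws & set(X) for u, ws in outside.items()}
    S_ext2 = {u for u, ws in outside.items() if len(ws) >= 2}
    E_S2 = {(u, w) for u in S_ext2 for w in outside[u]}
    return S_ext2, E_S2

E_plus = [("artist_a", "band_1"), ("artist_b", "band_1"),
          ("artist_c", "band_2"), ("band_3", "band_2")]
bands = {"band_1", "band_2", "band_3"}

S_ext2, E_S2 = direct_links_hint(E_plus, bands)
none_nodes, none_links = direct_links_hint(E_plus, bands, X={"artist_a", "artist_c"})
```

Only band_1 survives in the unfiltered case, because it is the only band that directly links two outside nodes.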

Connectivity Hint

A connectivity hint displays a sub-graph containing all the entities and links on the shortest path connecting selected nodes in the original graph. This hint enables the user to discover whether and how the selected entities are related. If there is no path connecting the nodes, then only the selected nodes are displayed. A connectivity hint can be defined on two or more nodes. In the following example (Figure 30), the aggregated graph suggests that the path between Arafat and Assad is of length 2, since they are both connected to the “Person” node. The hint reveals that the real paths are of length 4 and pass through organizations and events.
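A connectivity hint can be approximated with breadth-first search on the detailed graph. The sketch below is a simplification that keeps one shortest path per pair of selected nodes rather than all of them; the function names and node identifiers are hypothetical.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj.get(u, ()):
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return None

def connectivity_hint(E_plus, selected):
    """Nodes and links on a shortest path between every pair of selected nodes."""
    adj = {}
    for a, b in E_plus:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    nodes, edges = set(selected), set()
    for a, b in combinations(selected, 2):
        path = shortest_path(adj, a, b)
        if path:
            nodes.update(path)
            edges.update(zip(path, path[1:]))
    return nodes, edges

E_plus = [("p1", "org1"), ("org1", "event1"), ("event1", "org2"), ("org2", "p2")]
nodes, edges = connectivity_hint(E_plus, ["p1", "p2"])
lonely_nodes, lonely_edges = connectivity_hint(E_plus, ["p1", "p3"])
```

When no path exists, as for p1 and p3, the result contains only the selected nodes, matching the behavior described above.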

Figure 30: Connectivity hint between two nodes reveals the real path between them.

Semantic Filtering

Topological hints can be defined on directed graphs by modifying the definition of queries to use directed edges instead of undirected ones. However, semantic graphs add an additional dimension to all topological queries. To concentrate on specific types of nodes or edges we allow filtering the results of any topological query using the node and edge types. This means that the resulting sub-graph is filtered through additional constraints on the graph elements, which can reduce its clutter and focus the results. Such connections between graph semantics and graph topology are used further in the next sections.
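Semantic filtering of a hint result can be sketched as a post-processing step over the returned sub-graph. This is an illustrative sketch only: the function name, the typed edge triples (u, w, link_type) and the sample data are assumptions.

```python
def semantic_filter(nodes, edges, node_type,
                    allowed_node_types=None, allowed_edge_types=None):
    """Keep only nodes and typed edges that pass the type constraints.

    edges are (u, w, link_type) triples; node_type maps node -> type.
    A constraint of None means "no restriction".
    """
    kept_nodes = {n for n in nodes
                  if allowed_node_types is None or node_type[n] in allowed_node_types}
    kept_edges = {(u, w, t) for u, w, t in edges
                  if u in kept_nodes and w in kept_nodes
                  and (allowed_edge_types is None or t in allowed_edge_types)}
    return kept_nodes, kept_edges

node_type = {"p1": "Person", "p2": "Person", "org1": "Organization"}
hint_nodes = {"p1", "p2", "org1"}
hint_edges = {("p1", "p2", "family"), ("p1", "org1", "member_of"),
              ("p2", "org1", "funds")}

people_only = semantic_filter(hint_nodes, hint_edges, node_type,
                              allowed_node_types={"Person"})
membership = semantic_filter(hint_nodes, hint_edges, node_type,
                             allowed_edge_types={"member_of"})
```

Because the filter operates on any (nodes, edges) pair, the same step can follow each of the topological queries.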

Statistical Hints

We have discussed mapping of discrete or scalar values to visual depiction, for example, the number of nodes in a sub-graph or the type of nodes and edges. Other graph characteristics such as node-degree or graph-distances can determine important or interesting nodes such as central nodes, hubs and authorities. However, in many cases the distribution of values and not just the maximal or minimal instances can provide more information. In general, statistical analysis is a powerful tool to gain insight in semantic graphs and graphs in general [dNMB05, BW03]. Towards this end we define several types of statistical hints that can display sequences of values, distributions, and more complex graph characteristic queries.

Attribute Hint

The simplest statistical hint can depict an attribute of an entity. The semantic information of an entity in a graph can itself be a non-discrete or non-scalar value. In this case it cannot be represented by color coding, size, or other basic depictions. An attribute hint can provide the correct way of visualizing attributes such as a list of values (temperature, position, phone numbers).

Distribution Hint

An aggregated element (node or edge) represents a group of elements of various types that have various attributes. Presenting just the number of aggregated elements loses this important information. We define a distribution hint to depict the distribution of nodes or edges in a sub-graph according to their semantic types or some other semantic attribute. This enables analysis of the semantic information hidden in aggregate nodes and edges without the need to open and investigate the sub-graph. The results are more succinct depictions that do not alter the global context of the summarized graph. There are several ways of displaying distributions. In our system the user can choose either histograms or pie-charts to display distributions.

The entity types hint (Figure 31) displays the types of the entities in the inspected group. This group can be a node or can also contain a sub-graph. Using the entity types hint, the user can understand how many entities of each type the group contains, and decide whether it is interesting, should be expanded or possibly filtered out.
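At its core a distribution hint is a count of the entities in a group per type or attribute value. The following minimal sketch assumes entities are plain dictionaries with hypothetical "type" and "year" fields, and uses a text histogram as a stand-in for the histogram and pie-chart displays.

```python
from collections import Counter

def distribution_hint(entities, key):
    """Count the entities of a group per value of key(entity)."""
    return Counter(key(e) for e in entities)

def text_histogram(counts, width=20):
    """Render the counts as one bar of '#' characters per value."""
    if not counts:
        return []
    top = max(counts.values())
    return [f"{str(label):<10} {'#' * round(width * n / top)} {n}"
            for label, n in counts.most_common()]

group = [
    {"name": "w1", "type": "Album", "year": 1984},
    {"name": "w2", "type": "Album", "year": 1986},
    {"name": "w3", "type": "Single", "year": 1984},
    {"name": "w4", "type": "Film", "year": 1990},
]
by_type = distribution_hint(group, key=lambda e: e["type"])
by_year = distribution_hint(group, key=lambda e: e["year"])
bars = text_histogram(by_type)
```

Changing the key function switches between the entity types hint and a distribution over any other attribute.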

Figure 31: Entity types distribution hint

Similarly, the link types hint (Figure 32) displays the distribution of link types in an inspected edge (this is different from the tool-tip on edges, which displays just the list of types). The hint is very useful when the aggregated edge contains many links and a peek-in hint displays too much information.

Figure 32: Link types distribution hint

Distributions of other values could also be used, such as the distribution of some attribute of all nodes in a group (see example in Figure 33).

Figure 33: Year attribute distribution. The different works aggregated in the "Work" node are distributed according to their year attribute.

Distance Hint

Graph exploration often involves control over the level of detail and filtering. During navigation, the user has to decide whether to collapse, expand, hide or reveal subsets of the data. Distance hints make it possible to see and analyze the entities at different distances from the selected node. We have defined a few types of distance hints, each of them revealing different information about the neighborhood of the selected node.

Graph distances – the graph distances hint depicts the number of visible, hidden and collapsed nodes at each distance. The following example demonstrates the usage of the graph distance hint. When the graph view is naively expanded from Figure 34(a) to Figure 34(b), all the nodes at distance 2 from Madonna cause much clutter in the display. Using the distance hint in Figure 34(a) shows that many expanded nodes are filtered out at distance 2 (red part of the bar in (a)). The solution would be to collapse these elements before the expansion (Figure 34(c)) and then expand the graph to Figure 34(d).
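The graph distances hint can be sketched as a breadth-first traversal from the selected node followed by a count per distance and display state. The adjacency map, the state labels and the node names below are hypothetical.

```python
from collections import Counter, deque

def graph_distance_hint(adj, state, source):
    """Count nodes per (distance, state) around the source node.

    adj maps node -> iterable of neighbors; state maps node -> one of
    'visible', 'hidden' or 'collapsed'.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return Counter((d, state[n]) for n, d in dist.items() if n != source)

adj = {
    "focus": ["work_1", "work_2"],
    "work_1": ["focus", "label"],
    "work_2": ["focus", "label"],
    "label": ["work_1", "work_2"],
}
state = {"focus": "visible", "work_1": "visible",
         "work_2": "hidden", "label": "collapsed"}
hint = graph_distance_hint(adj, state, "focus")
```

Each key of the result corresponds to one colored segment of a bar in the hint: the distance selects the bar and the state selects the segment.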

Figure 34: Graph distance hint

Semantic distances – this hint is similar to the graph distance hint, except that the distance metric used is the Semantic Distance instead of the simple graph distance. The hint is helpful when the exploration is performed according to the Semantic Distance (as described in previous sections). Exploring with the semantic weights and consulting the semantic distances hint helps the analyst focus on the relevant information and reduce clutter by expanding only nodes that are semantically closer.

Figure 35: Graph distance vs. Semantic distance.

Distance by type – In semantic graphs, exploration decisions are based not only on topology and cardinal complexity but also on the semantic information contained in nodes and edges. The distance by type hint decomposes the entities at different distances according to their types, making it possible to see how many entities of each type are at every distance.

Figure 36: Distance by type hint. In this example it is possible to see that, for example, most of the entities at distance 4 from Madonna are from the type "Concept".

Node Degree Hint

The degree of a node in a graph is an important indicator of its significance. However, in a semantic graph not all edges and nodes should be treated equally. For various tasks certain edge types and node types are more important than others. The node degree hint provides a way to examine the distribution of degrees of the various nodes in a group. With semantic filtering, as will be explained next, the degree can also be visually decomposed according to edge types. The following example (Figure 37) demonstrates the node degree hint, decomposed according to the type of the connected entity. The hint makes it possible to see how many entities of each type are connected to each node inside the "Person" group.
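A node degree hint decomposed by the type of the connected entity can be sketched as one counter per node of the group. The function name and the sample graph are hypothetical, and edges are assumed to be undirected (u, w) tuples.

```python
from collections import Counter

def node_degree_hint(E_plus, node_type, group):
    """For each node in the group, count its neighbors per entity type."""
    group = set(group)
    hint = {n: Counter() for n in group}
    for a, b in E_plus:
        if a in group:
            hint[a][node_type[b]] += 1
        if b in group:
            hint[b][node_type[a]] += 1
    return hint

node_type = {"p1": "Person", "p2": "Person", "org1": "Organization",
             "ev1": "Event", "ev2": "Event"}
E_plus = [("p1", "org1"), ("p1", "ev1"), ("p1", "ev2"), ("p2", "org1"), ("p1", "p2")]

hint = node_degree_hint(E_plus, node_type, group={"p1", "p2"})
degree = {n: sum(counts.values()) for n, counts in hint.items()}
```

Summing a node's counter gives its plain degree, while the individual entries give the stacked segments of its bar.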

Figure 37: Node Degree hint.

Semantic Filtering

Just as the graphs produced by topological hint queries can be filtered by node types and edge types, statistical queries can also be projected onto semantic subsets.

Figure 38: Filtering of Node Degree hint. The user is able to decide which entity types should be included in the statistical hint.

Contextual Hints

There are times when the information contained in an inspected node is best depicted using some outside context. Such context can assist natural understanding and promote insight. The right context obviously depends on the semantics, but two typical examples are temporal and spatial attributes of entities. To illustrate, we present two contextual visual hints. The first maps nodes in a sub-graph that carry a time-stamp onto a time line, and the second maps nodes that carry geo-positioning onto a real map.

Time-line Hint

In the example shown in Figure 39, we examine all structures based on their time stamp. From this example it is easy to see that even though the data contains up-to-date structures from year 2000, most of the architects and buildings stored are from the 40’s and 50’s. Unlike histograms, this hint enables interactive inspection of each separate entity displayed, by clicking or using tool tips.

Figure 39: Timeline hint showing the distribution of structures on the time line

Positional Hint

When the entities in a group contain spatial information such as geo-positioning, it is useful to display them on a map, for example to inspect their spatial distribution. In our system we have implemented a positional hint that maps entities with position attributes to a geographical map using the Google static maps API [Goo03]. The example in Figure 40 displays information about the architect Antoni Gaudi and the structures he constructed. Viewing either the summarized graph or the semantic or topological hints, it is not possible to understand the spatial relationship of these structures. A geographical layout reveals the distribution of the structures in Barcelona. Our implementation of this hint enables deeper investigation of the data by linking the entities directly to the maps web interface for further interactive investigation.

Figure 40: Positional hint for structures designed by Gaudi

Chapter 4 Implementation

We have developed a graph visualization prototype system to test and evaluate the use of visual hints. The application is an interactive visualization tool enabling the user to explore a semantic graph. It was written in Java, using the Prefuse visualization toolkit [HCK_05] and its automatic force-directed layout. The application enables loading any data-set together with its schema in the standard RDF format [RDF03]. The data is automatically clustered according to the entity types defined by the schema. All the customizations and hint definitions for a specific data-set are carried out using the schema, without the need for code generation. These include icons and colors representing entity types and the definition of attribute hints for each entity type. Any entity that has properties annotated with the “Date” annotation, for example, will support the Timeline hint. Entities that have longitude and latitude properties will support the Positional hint.

We have used an Intel(R) Core(TM) 2 Duo CPU (2.20 GHz) with 2 GB of memory to run the application. Although the current implementation includes only a few optimizations, it can deal with graphs of several thousands of nodes and edges at interactive rates (see examples in the attached video).
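The schema-driven selection of contextual hints can be illustrated with a small sketch. It is written in Python for brevity rather than in the Java of the actual system, and the dictionary representation of an entity type's properties, the function name and the property names other than longitude and latitude are assumptions.

```python
def supported_hints(properties):
    """Decide which contextual hints an entity type supports.

    properties maps a property name to its annotation (for example 'Date'),
    or to None when the property carries no annotation.
    """
    hints = set()
    if any(annotation == "Date" for annotation in properties.values()):
        hints.add("timeline")
    if {"longitude", "latitude"}.issubset(properties):
        hints.add("positional")
    return hints

structure = {"name": None, "construction_year": "Date",
             "longitude": None, "latitude": None}
person = {"name": None, "birth_date": "Date"}
genre = {"name": None}

structure_hints = supported_hints(structure)
person_hints = supported_hints(person)
genre_hints = supported_hints(genre)
```

Deriving the hints from the schema in this way is what allows a new data-set to be loaded without writing data-set-specific code.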

Chapter 5 Results

Data Sources

There are various sources of Semantic Data, many of them available on the Internet. We have used three main data sources of semantic graphs in our experiments, illustrating different domains.

DBPedia [DBP09] is a community effort to extract structured information from Wikipedia and to make this information available on the Web. The DBpedia data set contains about 882,000 instances. In our work we used a subset of the data for the examples about musical artists and their works.

MindSwap [MIN05] is a project of the Semantic Web Research Group of the MIND LAB at the University of Maryland Institute for Advanced Computer Studies. This website was built in order to explore how the Semantic Web can be used to analyze terrorist activity. We used it for the political science examples.

FreeBase [Met09], created by Metaweb Technologies, is an open database of the world’s information. It is built by the community and for the community, and is free for anyone to query, contribute to, build applications on top of, or integrate into their websites. Freebase already covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archives, it contains structured information on many popular topics, including movies, music, people and locations, all reconciled and freely available via an open API.

Case Studies

We present two case studies that are shown in the accompanying video. In the following text we refer to Figure 41 for some screen shots.

Our first example uses the architecture data. Assume we investigate information about the architect Louis Sullivan. First, we use a distance hint filtered by type to learn about the neighborhood of the Sullivan entity in the graph. After expanding the graph, a peek-in hint on the aggregated “structures” node (Figure 41(a)) shows the names of the specific structures designed by Sullivan, but it does not show other attributes of the entities. With no visual hints, each structure must be examined separately to see specific attribute values such as the construction year. Comparison of values is also time consuming and difficult. On the other hand, distribution hints present a succinct depiction of the semantic information that is hidden in the aggregated “structures” node. For example, using an attribute hint on the year value, Figure 41(b) visualizes the construction date attribute of the structures and shows that they were designed between 1881 and 1922. The positional hint, on the other hand, displays the structures on a map according to their geo-location attributes (Figure 41(c)), making it possible to analyze their spatial distribution. The positional hint also enables a jump to an external application (Google Maps) for interactive investigation of their surroundings. Further exploration of the aggregated graph shows that there are some people related to Sullivan's structures, but it is impossible to see which ones exactly. The direct-links hint (Figure 41(d)) reveals all the external connections of the structures, showing which architects worked together with Sullivan on the structures.


Figure 41: Architects case study

The second example is taken from the music domain. Assume we investigate the work and relations of Madonna and Michael Jackson. We start by revealing all the nodes directly connected to Madonna and Jackson. We use the external link hint to inspect the aggregated people node. This topological hint makes it possible to analyze the relations between different people and the musicians (Figure 42(a)). In the aggregated graph, it is impossible to tell whether Madonna and Jackson have any connection. A connectivity hint shows the nodes connecting them: both are related to the Pop Music genre, and both were born in the United States (Figure 42(b)). We can compare the works of the two musicians using statistical hints. An entity type hint shows how many works of each type each artist has (Figure 42(c)). A year attribute hint depicts the distribution of the different works across time. Displaying these hints side by side enables easy comparison between the aggregated semantics (Figure 42(d)).

Figure 42: Musicians case study
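The connectivity hint used in the music example can be sketched in the same spirit. The following Python fragment is a simplified illustration under stated assumptions, not the system's implementation: it considers only nodes directly linked to both entities, the graph is given as (subject, link type, object) triples, and the link type names genre, birthPlace and work, as well as the two work nodes, are hypothetical.

```python
from collections import defaultdict


def neighbors(triples, node):
    """Map each neighbor of `node` to the set of link types joining them.

    Link direction is ignored, so a node is a neighbor whether it appears
    as the subject or the object of a triple.
    """
    result = defaultdict(set)
    for subject, link_type, obj in triples:
        if subject == node:
            result[obj].add(link_type)
        elif obj == node:
            result[subject].add(link_type)
    return result


def connectivity_hint(triples, a, b):
    """Return the nodes linked to both `a` and `b`, with the link types used."""
    near_a, near_b = neighbors(triples, a), neighbors(triples, b)
    shared = sorted(set(near_a) & set(near_b))
    return {n: (sorted(near_a[n]), sorted(near_b[n])) for n in shared}


# Hypothetical triples; link type names and work nodes are placeholders.
triples = [
    ("Madonna", "genre", "Pop Music"),
    ("Michael Jackson", "genre", "Pop Music"),
    ("Madonna", "birthPlace", "United States"),
    ("Michael Jackson", "birthPlace", "United States"),
    ("Madonna", "work", "work M1"),
    ("Michael Jackson", "work", "work J1"),
]

connections = connectivity_hint(triples, "Madonna", "Michael Jackson")
for node, (links_a, links_b) in connections.items():
    print(node, links_a, links_b)
```

Restricting the query to common neighbors keeps the sketch short; connections through longer paths would need a bounded path search instead.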

User Experiences

It has already been discussed that evaluations of interactive visualization systems are difficult [Pla04]. Numerous factors may affect the results, and separating them consistently is extremely complicated. Still, to validate our assumption that visual hints assist both navigation and exploration, we designed a simple questionnaire with several exploration tasks based on the three data sets we collected: architecture, music and politics. We asked several users to fill in the questionnaire and measured their performance.

Users were first introduced to the system and shown basic features such as how to zoom, pick and filter. Then the various visual hints were introduced and demonstrated. Following that, users were asked to practice using the system without any time limit by answering one questionnaire freely. This concluded the training session. Next, users were asked to perform some of the experiments either with or without hints, and their completion time was measured. Finally, users answered some general questions about their experience.

We tested two basic hypotheses. First, that visual hints can reduce the time needed to find specific information and answer questions in navigation. Second, that visual hints can increase the amount of information gathered in a given time while exploring the graph. To validate the first hypothesis we designed tests containing specific questions such as “In which years was Madonna most active?” for the music data set or “What types of relations exist between people and organizations?” for the political data. We measured the time taken to answer these questions with and without hints. To validate the second hypothesis we gave each user a limited amount of exploration time and asked them to find as many interesting facts as they could about a specific entity (the architect Antoni Gaudi).

We ran these studies with 7 participants. The full questionnaire and answers can be supplied as supplemental material. For specific questions answered with and without hints, answering with hints was between 24% and 43% faster on average. However, some of the questions, such as “In which years was Madonna most active?”, were almost impossible to answer without hints, and several participants simply gave up. One can argue that questions related to the distribution of values are biased and impractical to answer without hints. Still, we feel this only emphasizes the need for combining statistics with graph navigation using hints. In the second part of the questionnaire, people using hints found twice as many facts about Gaudi in the given time frame. When asked about their experiences, all users graded the usefulness of hints as very high and were satisfied with their addition to a graph exploration system. Although only preliminary, we believe these results indicate that visual hints are a useful tool for semantic graph exploration.


Chapter 6 Conclusions

We have presented the notion of visual hints as a tool to assist semantic graph exploration. We defined three types of hints: topological, statistical and contextual. Using specific examples, we demonstrated the effectiveness of these hints for gaining understanding of and insight into the data.

In the future we would like to further optimize some of the hint calculations to enable exploring huge graphs (millions of nodes). We would also like to investigate the possibility of formulating the definition of queries (both topological and statistical), which would simplify the definition of new visual hints and extend their possibilities. We are also interested in linking other types of data stored in semantic graphs to their correct context, in order to define new contextual hints and promote interoperability.


Bibliography

[AK02] ABELLO, KORN: MGV: A system for visualizing massive multidigraphs. IEEE Transactions on Visualization and Computer Graphics 8 (2002).
[BCE05] BARTHELEMY M., CHOW E., ELIASSI-RAD T.: Knowledge Representation Issues in Semantic Graphs for Relationship Detection. AI Technologies for Homeland Security: Papers from the 2005 AAAI Spring Symposium, AAAI Press, pp. 91–98, 2005.
[BD07] BALZER M., DEUSSEN O.: Level-of-detail visualization of clustered graph layouts. In APVIS (2007), Hong S.-H., Ma K.-L., (Eds.), IEEE, pp. 133–140.
[Bor09a] BORGATTI S.: Netdraw. Analytic Technologies, http://www.analytictech.com/, 2009.
[Bor09b] BORGATTI S.: UCINET 6. Analytic Technologies, http://www.analytictech.com/, 2009.
[BW03] BRANDES U., WAGNER D.: Visone - analysis and visualization of social networks. In Graph Drawing Software (2003), Springer-Verlag, pp. 321–340.
[CPC09] COLLINS C., PENN G., CARPENDALE S.: Bubble sets: Revealing set relations with isocontours over existing visualizations. IEEE Transactions on Visualization and Computer Graphics (Proceedings of the IEEE Conference on Information Visualization (InfoVis ’09)) 15, 6 (2009).
[DBP09] DBPedia. http://wiki.dbpedia.org, 2009.
[DKS07] DELIGIANNIDIS L., KOCHUT K., SHETH A. P.: RDF data exploration and visualization. In CIMS (2007), Mitra P., Giles C. L., Carr L., (Eds.), ACM, pp. 39–46.
[dNMB05] DE NOOY W., MRVAR A., BATAGELJ V.: Exploratory Social Network Analysis with Pajek. No. 27 in Structural Analysis in the Social Sciences. Cambridge University Press, Cambridge, 2005.
[EH00] EADES P., HUANG M. L.: Navigating clustered graphs using force-directed methods. J. Graph Algorithms Appl 4, 3 (2000), 157–181.
[FT09] FRISHMAN Y., TAL A.: Uncluttering graph layouts using anisotropic diffusion and mass transport. IEEE Transactions on Visualization and Computer Graphics 15, 5 (2009), 777–788.
[FTjH05] FRASINCAR F., TELEA R., JAN HOUBEN G.: Adapting graph visualization techniques for the visualization of rdf data. In Visualizing the Semantic Web, 2006 (2005), pp. 154–171.
[Fur86] FURNAS G. W.: Generalized fisheye views. In Proceedings of CHI ’86, Human Factors in Computing Systems (1986).
[Goo03] GOOGLE: Google Static Maps API. http://code.google.com/apis/maps/documentation/staticmaps/, 2003.
[HCK_05] HEER J., CARD S. K., LANDAY J. A.: prefuse: a toolkit for interactive information visualization. In Proceedings of ACM CHI 2005 Conference on Human Factors in Computing Systems (2005), vol. 1 of Interactive information visualization, pp. 421–430.
[HHG09] HIRSCH C., HOSKING J., GRUNDY J.: Interactive visualization tools for exploring the semantic graph of large knowledge spaces. In Proceedings of the Visual Interfaces to the Social and the Semantic Web (VISSW 2009) (2009).
[HMM00] HERMAN I., MELANCON G., MARSHALL M. S.: Graph visualization and navigation in information visualization: A survey. IEEE Transactions on Visualization and Computer Graphics 6, 1 (2000), 24–43.
[JP02] JANECEK P., PU P.: A framework for designing fisheye views to support multiple semantic contexts. In AVI ’02: Proceedings of the Working Conference on Advanced Visual Interfaces (New York, NY, USA, 2002), ACM, pp. 51–58.
[JP05] JANECEK P., PU P.: An evaluation of semantic fisheye views for opportunistic search in an annotated image collection. Int. J. on Digital Libraries 5, 1 (2005), 42–56.
[Met09] METAWEB TECHNOLOGIES: FreeBase. http://www.freebase.com/, 2009.
[MIN05] MIND LAB: MindSwap: Semantic Web Terrorism Knowledge Base. University of Maryland Institute for Advanced Computer Studies, http://profilesinterror.mindswap.org/, 2005.
[Pla04] PLAISANT C.: The challenge of information visualization evaluation. In AVI ’04: Proceedings of the working conference on Advanced visual interfaces (New York, NY, USA, 2004), ACM, pp. 109–116.
[PS06a] PERER A., SHNEIDERMAN B.: Balancing systematic and flexible exploration of social networks. IEEE Transactions on Visualization and Computer Graphics 12, 5 (2006), 693–700.
[PS08] PERER A., SHNEIDERMAN B.: Integrating statistics and visualization: Case studies of gaining clarity during exploratory data analysis. In CHI ’08: Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems (New York, NY, USA, 2008), ACM, pp. 265–274.
[RDF03] Resource description framework, 2003.
[SF08] SPRITZER A. S., FREITAS C. M. D. S.: A physics-based approach for interactive manipulation of graph visualizations. In AVI (2008), Levialdi S., (Ed.), ACM Press, pp. 271–278.
[Shn96] SHNEIDERMAN B.: The eyes have it: A task by data type taxonomy for information visualizations. In VL (1996), pp. 336–343.
[SMER06] SHEN Z., MA K.-L., ELIASSI-RAD T.: Visual analysis of large heterogeneous social networks by semantic and structural abstraction. IEEE Trans. Vis. Comput. Graph 12, 6 (2006), 1427–1439.
[Sof09] SOFTWARE T. S.: Tom Sawyer Visualization. http://www.tomsawyer.com/, 2009.
[TAvHS06] TOMINSKI C., ABELLO J., VAN HAM F., SCHUMANN H.: Fisheye tree views and lenses for graph visualization. In Proc. IEEE Information Visualization (InfoVis2006) (2006), IEEE CS Press, pp. 17–24.
[THP08] TIAN Y., HANKINS R. A., PATEL J. M.: Efficient aggregation for graph summarization. In SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data (New York, NY, USA, 2008), ACM, pp. 567–580.
[WHA07] WILLETT W., HEER J., AGRAWALA M.: Scented widgets: Improving navigation cues with embedded visualizations. IEEE Trans. Vis. Comput. Graph 13, 6 (2007), 1129–1136.