Linked Data Visualization Matt Bernier Joey Murphy David Coleman.

Transcript of Linked Data Visualization Matt Bernier Joey Murphy David Coleman.

Page 1

Linked Data Visualization

Matt Bernier

Joey Murphy

David Coleman

Page 2

Needs Analysis

Allow users to view data sets graphically using intuitive and efficient controls

Specifically to view links among data points

Contemporary methods include: diagrams, graphs, and lists

Enable users to perform analysis on their data

Page 3

Market Analysis

Linked data is present in several environments:
• Search engines (page ranking)
• Social networks (recreational, academic, professional)
• Other database-driven sites
• Computer networks

Page 4

Market Analysis

Market Users
• Website owners (50-100 million active domains, multiple sites per domain)
• Enterprise internal site managers
• Social network operators and users (more than 200 sites online)
• Network administrators

Page 5

Background

Web sites supporting large numbers of users are very popular

Finding common usage statistics can be very beneficial
• Purchasing similar products
• Participating in common discussions
• Common browsing habits
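
As a rough illustration of how shared usage becomes linked data, the sketch below links two users whenever they bought the same product. The record shape (`user`, `product`), the sample data, and the function name `coPurchaseLinks` are assumptions made for this example; they do not come from the project.

```javascript
// Hypothetical purchase records: which user bought which product.
const purchases = [
  { user: "alice", product: "camera" },
  { user: "bob", product: "camera" },
  { user: "bob", product: "tripod" },
  { user: "carol", product: "tripod" },
  { user: "dave", product: "lens" },
];

// Link two users whenever they bought the same product.
// The link weight counts how many products the pair has in common.
// Assumes user names never contain the "|" character.
function coPurchaseLinks(records) {
  // Group buyers by product.
  const buyersByProduct = new Map();
  for (const { user, product } of records) {
    if (!buyersByProduct.has(product)) buyersByProduct.set(product, new Set());
    buyersByProduct.get(product).add(user);
  }

  // Count shared products for every pair of buyers.
  const weights = new Map(); // key "a|b" with a sorted before b
  for (const buyers of buyersByProduct.values()) {
    const sorted = [...buyers].sort();
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const key = `${sorted[i]}|${sorted[j]}`;
        weights.set(key, (weights.get(key) || 0) + 1);
      }
    }
  }

  return [...weights].map(([key, weight]) => {
    const [source, target] = key.split("|");
    return { source, target, weight };
  });
}

const userLinks = coPurchaseLinks(purchases);
console.log(userLinks);
```

With this sample data, alice and bob are linked through the camera, bob and carol through the tripod, and dave stays unlinked. The same pattern would apply to common discussions or browsing habits by swapping the product field for a thread or page.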

Page 6

Background

Showing Web links
• How websites link together
• Visualizing the web

Visualizing any linked data sets

Page 7

Linked Data Example

Page 8

An Existing Application

Create Random Nodes

Page 9

Goals and Objectives

Overall goal is to create an intuitive web-based tool that allows users to see links within their data

Allow users to analyze and infer information from the links

Make it easy for web programmers to implement the graph on their site using a PHP class structure

Page 10

Tools

HTML, CSS (data presentation)
PHP (data objects, processing)
JavaScript (graph creation, interaction)
• JSViz (framework for dynamic views; its force-directed algorithm creates a graph that is aesthetically pleasing)
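
To give a feel for what a force-directed algorithm does, here is a minimal sketch of one common formulation: every pair of nodes repels, and linked nodes are pulled together by springs. This is not JSViz's implementation. The function name `layoutStep`, the node and link shapes, and all constants are assumptions chosen for illustration.

```javascript
// One relaxation step of a basic force-directed layout.
// nodes: [{ id, x, y }]; links: [{ source, target }] holding node ids.
// All constants are illustrative assumptions, not JSViz defaults.
function layoutStep(nodes, links, options = {}) {
  const { repulsion = 1000, stiffness = 0.05, restLength = 50, maxMove = 10 } = options;
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const force = new Map(nodes.map((n) => [n.id, { x: 0, y: 0 }]));

  // Add a force of the given magnitude along the unit vector (ux, uy).
  const push = (node, ux, uy, magnitude) => {
    const f = force.get(node.id);
    f.x += ux * magnitude;
    f.y += uy * magnitude;
  };

  // Every pair of nodes repels with an inverse-square force.
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const a = nodes[i];
      const b = nodes[j];
      const dx = b.x - a.x;
      const dy = b.y - a.y;
      const dist = Math.max(Math.hypot(dx, dy), 0.01); // avoid division by zero
      const f = repulsion / (dist * dist);
      push(a, dx / dist, dy / dist, -f); // a moves away from b
      push(b, dx / dist, dy / dist, f); // b moves away from a
    }
  }

  // Each link acts as a spring that prefers length restLength.
  for (const { source, target } of links) {
    const a = byId.get(source);
    const b = byId.get(target);
    const dx = b.x - a.x;
    const dy = b.y - a.y;
    const dist = Math.max(Math.hypot(dx, dy), 0.01);
    const f = stiffness * (dist - restLength); // positive when stretched
    push(a, dx / dist, dy / dist, f); // a moves toward b when stretched
    push(b, dx / dist, dy / dist, -f);
  }

  // Move each node along its net force, capped at maxMove per step.
  for (const n of nodes) {
    const f = force.get(n.id);
    const magnitude = Math.hypot(f.x, f.y);
    const scale = magnitude > maxMove ? maxMove / magnitude : 1;
    n.x += f.x * scale;
    n.y += f.y * scale;
  }
}

const nodes = [
  { id: "a", x: 0, y: 0 },
  { id: "b", x: 10, y: 0 },
  { id: "c", x: 0, y: 10 },
];
const links = [
  { source: "a", target: "b" },
  { source: "b", target: "c" },
];
for (let step = 0; step < 300; step++) layoutStep(nodes, links);
console.log(nodes);
```

Repeating the step lets the graph settle into a layout where linked nodes sit near the rest length and unlinked nodes drift apart. The pairwise repulsion loop is quadratic in the number of nodes, which is one reason performance can degrade on large data sets.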

Page 11

System Diagram

Page 12

Literature Review
General Ideas

Building graphs from data sets
Displaying data
Data analysis
• Examining and inferring relationships
• Prediction
• Application to real world

Page 13

Literature Review

Presenting data to users
• Tree structures, Data -> Information
• “Inducing the chosen mental model in the mind of the observer”
• Easy to understand
• Allows for more information to be absorbed by observers

Aaron Kershenbaum and Keitha Murray. In Journal of Circuit Systems and Computers

Page 14

Literature Review

Many theories and techniques for graph analysis, but not construction

Choice of nodes and links
• What is represented by a node?
• What is represented by a link?
• These choices greatly influence meaning in a linked data display
• e.g. hyperlinks, Enron email dataset

A. Badia and M. Kantardzic. In Proceedings of the 3rd International Workshop on Link Discovery (LinkKDD '05)

J. Shetty and J. Adibi. In KDD ’05
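
The effect of these choices can be sketched with a small email example. Here a node is a person and a link means "sent a message to"; treating links as directed or undirected yields different graphs from the same records. The record fields (`from`, `to`), the sample data, and the function name `buildEmailGraph` are assumptions for illustration and are not taken from the cited papers or the Enron data format.

```javascript
// Hypothetical email records; the field names are assumptions.
const emails = [
  { from: "ann", to: "ben" },
  { from: "ann", to: "ben" },
  { from: "ben", to: "ann" },
  { from: "ben", to: "cat" },
];

// Nodes are people. A link means "sent at least one message to".
// With directed = false, a->b and b->a are merged into a single link.
function buildEmailGraph(records, directed = true) {
  const people = new Set();
  const counts = new Map();
  for (const { from, to } of records) {
    people.add(from);
    people.add(to);
    // For undirected graphs, store each pair in sorted order.
    const pair = directed || from < to ? [from, to] : [to, from];
    const key = JSON.stringify(pair);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const links = [...counts].map(([key, messages]) => {
    const [source, target] = JSON.parse(key);
    return { source, target, messages };
  });
  return { nodes: [...people], links };
}

const directedGraph = buildEmailGraph(emails, true);
const undirectedGraph = buildEmailGraph(emails, false);
console.log(directedGraph.links.length, undirectedGraph.links.length);
```

The directed graph keeps who wrote to whom, while the undirected graph only records that two people corresponded. Which one is appropriate depends on the question the display is meant to answer.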

Page 15

Literature Review

Link Mining – analyzing links
• Makes use of descriptive and predictive modeling (data mining)
• e.g. determining webpage relevance based on anchor text and surrounding text of incoming hyperlinks
• e.g. segregating website users into groups based on common behaviours

L. Getoor. In ACM SIGKDD Explorations Newsletter, Vol. 5, Issue 1, 2003

Page 16

Literature Review

Link prediction
• Uses node proximity
• “Information about future interactions can be extracted from network topology alone”
• Predicting links that represent online social interaction can help to determine the feasibility of adding new interaction features to a site

D. Liben-Nowell and J. Kleinberg. In CIKM '03
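
One simple proximity measure used in the link-prediction literature is the number of common neighbors: two unlinked nodes that share many neighbors are considered likely to become linked. The sketch below scores unlinked pairs this way. The sample data and the names `neighborSets` and `predictLinks` are assumptions for illustration; this is not the project's code.

```javascript
// Hypothetical undirected links between users.
const friendships = [
  ["ann", "ben"],
  ["ann", "cat"],
  ["ben", "cat"],
  ["ben", "dan"],
  ["cat", "dan"],
  ["dan", "eve"],
];

// Map each node to the set of nodes it is linked to.
function neighborSets(links) {
  const neighbors = new Map();
  const add = (a, b) => {
    if (!neighbors.has(a)) neighbors.set(a, new Set());
    neighbors.get(a).add(b);
  };
  for (const [a, b] of links) {
    add(a, b);
    add(b, a);
  }
  return neighbors;
}

// Score every unlinked pair by its number of common neighbors,
// using only the network topology. Highest scores come first.
function predictLinks(links) {
  const neighbors = neighborSets(links);
  const ids = [...neighbors.keys()].sort();
  const scored = [];
  for (let i = 0; i < ids.length; i++) {
    for (let j = i + 1; j < ids.length; j++) {
      const a = ids[i];
      const b = ids[j];
      if (neighbors.get(a).has(b)) continue; // already linked
      let common = 0;
      for (const n of neighbors.get(a)) {
        if (neighbors.get(b).has(n)) common++;
      }
      if (common > 0) scored.push({ a, b, score: common });
    }
  }
  return scored.sort((x, y) => y.score - x.score);
}

const predictions = predictLinks(friendships);
console.log(predictions);
```

In this sample, ann and dan share two neighbors (ben and cat), so that pair ranks first, ahead of ben-eve and cat-eve with one shared neighbor each.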

Page 17

Patent Analysis

Computer-implemented system and method for handling linked data views, Patent number 7,068,267, held by SAS Institute Inc.
• A first view and a second view are used to display at least a portion of the data observations contained in the data model. Conditional data that is associated with the second view specifies how the second view's display is modified based upon a selection of a data observation within the first view.

Page 18

Timeline

Page 19

Advantages

Design allows for customization
Custom data objects
Almost all visual aspects of the graph are easily changed or left as default settings

Page 20

Disadvantages

Requires a network connection and a browser
• Or an Apache and PHP installation on a local machine
As dataset grows larger, application performance may degrade
Possible browser compatibility issues
• These are typical web issues with HTML, JavaScript, and CSS rendering

Page 21

Requirements Analysis

Functionality (performance)
Flexibility
• Allow users and developers to customize and deploy application as they see fit
Reliability
• Provide an accurate data representation
Quality
• Provide a meaningful, visual representation of data

Page 22

Requirements Analysis

Operating Environment
• Scripts:
  Run on a webserver with PHP (4.0+) installation
  Can interface with databases
• Users:
  Cross-system
  Cross-browser

Page 23

Requirements Analysis

Interfaces
• A PHP class is provided, and the data to be visualized is added by the user.
Performance Requirements
• Time required to produce display varies with size of dataset
• 1-10 seconds
• Restrict size of datasets to prevent browser/computer from suffering
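
One possible way to restrict dataset size before drawing is to keep only the most connected nodes. The slides do not say how the limit is enforced or what the limit is, so the policy below, the function name `limitGraph`, and the cap of 3 in the usage example are all assumptions made for illustration.

```javascript
// Keep at most maxNodes nodes, preferring those with the most links,
// and drop any link that touches a removed node.
// nodes: array of ids; links: [{ source, target }] holding ids.
function limitGraph(nodes, links, maxNodes) {
  // Count how many links touch each node.
  const degree = new Map(nodes.map((id) => [id, 0]));
  for (const { source, target } of links) {
    degree.set(source, (degree.get(source) || 0) + 1);
    degree.set(target, (degree.get(target) || 0) + 1);
  }

  // Sort a copy by descending degree and keep the first maxNodes ids.
  const kept = new Set(
    [...nodes].sort((a, b) => degree.get(b) - degree.get(a)).slice(0, maxNodes)
  );

  return {
    nodes: nodes.filter((id) => kept.has(id)),
    links: links.filter((l) => kept.has(l.source) && kept.has(l.target)),
  };
}

const fullGraph = {
  nodes: ["a", "b", "c", "d"],
  links: [
    { source: "a", target: "b" },
    { source: "a", target: "c" },
    { source: "a", target: "d" },
    { source: "b", target: "c" },
  ],
};
const trimmed = limitGraph(fullGraph.nodes, fullGraph.links, 3);
console.log(trimmed);
```

Here node d has the fewest links, so it is dropped together with the a-d link. Other policies, such as sampling or keeping only nodes near a chosen focus node, would fit the same requirement.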

Page 24

Requirements Analysis

Resources
• Design was conceived prior to undertaking project, 10 man-hours to refine design
• Coding – 20 man-hours
• Testing – 15 man-hours

Page 25

Demo

Example

Page 26

Future

More complex display
• Hyperlinks and/or pictures as nodes
• Re-centering graph by clicking a node
• Mouse-over events for more detail
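
Re-centering could be sketched as a simple translation: shift every node by the offset that moves the clicked node to the middle of the view. The function name `recenter`, the node shape, and the view size are assumptions for illustration; in a browser this would presumably be called from a click handler on the node's element.

```javascript
// Return a copy of the layout shifted so that the node with id nodeId
// sits at the center of a view of the given width and height.
// nodes: [{ id, x, y }]
function recenter(nodes, nodeId, width, height) {
  const focus = nodes.find((n) => n.id === nodeId);
  if (!focus) return nodes; // unknown id: leave the layout unchanged
  const dx = width / 2 - focus.x;
  const dy = height / 2 - focus.y;
  return nodes.map((n) => ({ ...n, x: n.x + dx, y: n.y + dy }));
}

const layout = [
  { id: "a", x: 100, y: 100 },
  { id: "b", x: 300, y: 200 },
];
const centeredOnB = recenter(layout, "b", 400, 300);
console.log(centeredOnB);
```

Because every node moves by the same offset, distances between nodes are preserved and only the viewpoint changes.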