SpRay: an R-based visual-analytics platform for large and high-dimensional datasets

J. Heinrich (1), J. Dietzsch (1), D. Bartz (2), K. Nieselt (1)

(1) Center for Bioinformatics, University of Tübingen
(2) ICCAS/VCM, University of Leipzig

useR! 2008, August 12, 2008

Outline

1 Introduction

2 SpRay

3 Discussion

4 Future Work

Data Sets Become Increasingly Large

- High-throughput techniques yield a huge amount of data
  - Microarrays
  - CT scanner
  - Simulation data
- Many data sets are high-dimensional
  - Time series: 100 experiments, 5 replicates, 10000 oligos
  - 10000 rows × 500 columns = 5 · 10^6 data points
- ... and complex
  - Heterogeneous data (categorical, metric)
  - Invalid data (NA, NaN)
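The time-series figures on this slide can be checked with a few lines of arithmetic. The sketch below reproduces the 500 columns and 5 · 10^6 data points, and adds a rough memory estimate that is not on the slide. That estimate assumes every value is stored as an 8-byte double, which is how R stores numeric values. The variable names are illustrative only.

```python
# Back-of-the-envelope sizing for the time-series example on the slide.
experiments = 100
replicates = 5
oligos = 10_000  # one row per oligo

columns = experiments * replicates   # 100 * 5 = 500 columns
data_points = oligos * columns       # 10000 * 500 = 5,000,000 values

# Assumption (not from the slide): each value is an 8-byte double.
bytes_per_value = 8
size_mb = data_points * bytes_per_value / 1e6

print(f"{columns} columns, {data_points:.1e} data points, about {size_mb:.0f} MB")
```

Under that assumption the raw matrix occupies roughly 40 MB, before any copies made for rendering or analysis.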

Knowledge Discovery Becomes Increasingly Difficult

Effects of large and high-dimensional datasets on the analysis:

- Storage: obvious
- Speed: time to read, locate, compute, render, and display the data
- Quality: errors, administration
- Complexity: more variables, more detail, special cases, ...
- Visualization: dimensionality, occlusion, identification

Visual Analytics with R

Analytical Reasoning
- Gain insight into data
- Reveal underlying structure and model
- Extract information contained

Techniques
- Data Analysis
- Visualization
- Interaction

Visual Analytics with R: Related Work

GGobi [1]         RGL [2]             iPlots [3]
- linked views    - no linked views   - linked views
- CPU only        - CPU/GPU           - CPU/GPU
- R optional      - depends on R      - depends on R

[1] [Swayne et al., 2003]
[2] [Adler and Nenadic, 2003]
[3] [Urbanek and Theus, 2003]

SpRay: viSual exPloRation and anAlYsis of high-dimensional data

- linked views
- CPU/GPU
- R optional

SpRay: Objectives

- Extendable
- Interactive
- Portable
- Statistical Backend
- High-Performance

SpRay: Architecture

VisLib
- Independent Visualization Library

Plugins
- Implement the plugin-interface
- Make use of VisLib (optional)

Host Application
- Defines the plugin-interface
- Organizes communication
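The split between host application and plugins can be illustrated with a minimal sketch. The slides do not show SpRay's actual plugin interface, so every name below (`Plugin`, `Host`, `RecordingView`, `on_selection`, `select`) is hypothetical. The sketch only shows the general pattern of a host that defines an interface and relays messages between the plugins implementing it, here a shared row selection as used for linked views.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Hypothetical plugin interface, defined by the host application."""

    @abstractmethod
    def on_selection(self, rows):
        """Called by the host whenever the shared selection changes."""


class Host:
    """Hypothetical host: registers plugins and relays messages between them."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin):
        self._plugins.append(plugin)

    def select(self, rows, sender=None):
        # Broadcast the new selection to every plugin except the sender.
        selection = frozenset(rows)
        for plugin in self._plugins:
            if plugin is not sender:
                plugin.on_selection(selection)


class RecordingView(Plugin):
    """Stand-in for a view plugin; it only remembers which rows to highlight."""

    def __init__(self, name):
        self.name = name
        self.highlighted = frozenset()

    def on_selection(self, rows):
        self.highlighted = rows


host = Host()
scatter = RecordingView("scatterplot")
table = RecordingView("data table")
host.register(scatter)
host.register(table)

# A brush in the scatterplot selects rows 3 and 7; the host forwards it.
host.select({3, 7}, sender=scatter)
```

Because plugins talk only to the host, a new view can be added without changing the existing ones, which fits the "Extendable" objective.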

Plugins

Currently available:
- Parallel Coordinates
- Scatterplot
- Histogram
- Data Table
- TableLens
- R-Console
- Brushing

Parallel Coordinates

Scatterplot

Data Table and R-Console

Data Table R-Console

TableLens

[Rao and Card, 1994]

Linking and Brushing

Performance

Depends on
- Size of the data set
- Number of plugins loaded
- Operation in progress
- Available hardware (GPU?)

Results
- Lower response times than GGobi/iPlots/RGL/Mondrian
- Good performance for middle-sized datasets

Discussion

Objectives achieved
- Extendable Visual-Analytics-Framework
- Independent Visualization Library
- Hardware-accelerated Graphics
- Statistical Backend using R
- Interactivity
- Good performance / Low response times

Problems
- Redundancy in frequently used calculations
- Very basic interface to R
- Categorical data only supported via the R-plugin

Future Work

- Incorporate meta-information into the data model to avoid redundancy (e.g. maxima)
- Add/improve plugins (Heatmap, 3D Plots, ...)
- Extend interface to R (hot-linking, selections)
- Improve GPU usage (textures, framebuffer objects, ...)
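The first item, storing meta-information such as maxima in the data model, can be sketched as a column that caches its maximum instead of rescanning the data for every plugin that asks. This is only an illustration of the idea, not SpRay's data model. The `Column` class and its methods are hypothetical, and the sketch assumes the values contain no NA/NaN entries.

```python
class Column:
    """Hypothetical column that stores its maximum as meta-information."""

    def __init__(self, values):
        self._values = list(values)
        self._max = None  # cached maximum; None means "not computed yet"
        self.scans = 0    # counts full passes over the data (for illustration)

    def maximum(self):
        # Scan the data only if no cached value is available.
        if self._max is None:
            self.scans += 1
            self._max = max(self._values)
        return self._max

    def append(self, value):
        self._values.append(value)
        # Keep an existing cached maximum up to date without rescanning.
        if self._max is not None and value > self._max:
            self._max = value


col = Column([2.0, 9.5, 4.1])
col.maximum()     # first call scans the data
col.maximum()     # second call reuses the stored value
col.append(12.0)  # updates the cache incrementally
```

Several views that need the same axis range (for example parallel coordinates and a scatterplot) would then share one computation instead of each repeating it.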

Thank You!

References

Adler, D. and Nenadic, O. (2003). A Framework for an R to OpenGL Interface for Interactive 3D graphics. In Proc. of the 3rd International Workshop on Distributed Statistical Computing.

Rao, R. and Card, S. K. (1994). The table lens: merging graphical and symbolic representations in an interactive focus + context visualization for tabular information. In Proc. of SIGCHI conference on Human factors in computing systems, pages 318–322, New York, NY, USA. ACM.

Swayne, D. F., Lang, D. T., Buja, A., and Cook, D. (2003). GGobi: evolving from XGobi into an extensible framework for interactive data visualization. Computational Statistics and Data Analysis, 43(4):423–444.

Urbanek, S. and Theus, M. (2003). iPlots - High Interaction Graphics for R. In Proc. of the 3rd International Workshop on Distributed Statistical Computing.
