IAT 814 Data

Page 1: IAT 814 Data

IAT 814, Sep 23, 2013

IAT 814

Data

SCHOOL OF INTERACTIVE ARTS + TECHNOLOGY [SIAT] | WWW.SIAT.SFU.CA

Page 2: IAT 814 Data

Agenda

• Data forms and representations
• Basic representation techniques
• Multivariate (>3) techniques

Page 3: IAT 814 Data

Data Sets

• Data comes in many different forms
• Typically, not in the way you want it

• How is it stored (in the raw)?

Page 4: IAT 814 Data

Example

• Cars
  – make
  – model
  – year
  – miles per gallon
  – cost
  – number of cylinders
  – weight
  – ...

Page 5: IAT 814 Data

Data Tables

• Often, we take raw data and transform it into a form that is more workable

• Main idea:
  – Individual items are called cases
  – Cases have variables (attributes)

Page 6: IAT 814 Data

Data Table Format

• Think of it as a function: F(Case1) = <Value11, Value12, …>

         Variable1   Variable2   Variable3
Case1    Value11     Value12     Value13
Case2    Value21     Value22     Value23
Case3    Value31     Value32     Value33

The variables (columns) are the dimensions.

Page 7: IAT 814 Data

Example student data

Name     Student Num   Age   Entered SFU   GPA
Mary     65432101      20    Sep 2010      4.0
Tom      98765651      22    Sep 2013      2.3
Louise   89846251      19    Jan 2012      3.1
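The data-table idea can be sketched in a few lines of Python. This is only an illustration built on the example student table; the names `Student`, `TABLE`, `F` and `column` are hypothetical and chosen to mirror the notation F(Case) = <values>.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """One case (row); each field is a variable (column)."""
    name: str
    student_num: int
    age: int
    entered_sfu: str
    gpa: float


# Cases copied from the example student table.
TABLE = {
    "Mary": Student("Mary", 65432101, 20, "Sep 2010", 4.0),
    "Tom": Student("Tom", 98765651, 22, "Sep 2013", 2.3),
    "Louise": Student("Louise", 89846251, 19, "Jan 2012", 3.1),
}


def F(case: str) -> Student:
    """Map a case to its tuple of variable values."""
    return TABLE[case]


def column(variable: str) -> list:
    """Return one variable (one dimension) across all cases."""
    return [getattr(row, variable) for row in TABLE.values()]
```

Reading a row with `F("Tom")` gives the whole case; reading a column with `column("gpa")` gives one dimension, which is the view most of the later plots start from.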

Page 8: IAT 814 Data

Variable Types

• Three main types of variables
  – N - Nominal (equal or not equal to other values)
    • Example: gender
  – O - Ordinal (obeys < relation, ordered set)
    • Example: mild, medium, hot, suicide
  – Q - Quantitative (can do math on them)
    • Example: age
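A small Python sketch of which operations are meaningful for each variable type. The sample gender codes are hypothetical, the ages come from the example student table, and the `Heat` enum is an illustrative encoding of the ordinal example, not something the slides prescribe.

```python
from enum import IntEnum

# Nominal: only equality tests are meaningful.
genders = ["F", "M", "F"]            # hypothetical sample values
same = genders[0] == genders[2]
# genders[0] < genders[1] runs in Python, but the result means nothing for nominal data.


# Ordinal: values are ordered, but differences between them are not meaningful.
class Heat(IntEnum):
    MILD = 1
    MEDIUM = 2
    HOT = 3
    SUICIDE = 4


hotter = Heat.HOT > Heat.MEDIUM
ranked = sorted([Heat.HOT, Heat.MILD, Heat.SUICIDE])
# Heat.HOT - Heat.MILD is computable with IntEnum, but that "distance" has no meaning.

# Quantitative: arithmetic is meaningful.
ages = [20, 22, 19]
mean_age = sum(ages) / len(ages)
```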

Page 9: IAT 814 Data

Metadata

• Descriptive information about the data
  – Might be something as simple as the type of a variable, or could be more complex
  – For times when the table itself just isn’t enough
  – Example: if variable1 is “l”, then variable3 can only be 3, 7 or 16
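One possible way to hold such metadata next to a table is sketched below. The `METADATA` layout and the `is_valid` helper are hypothetical, and the N/Q types assigned to variable1 and variable3 are assumptions; only the rule itself comes from the example above.

```python
# Hypothetical metadata record: variable types plus one cross-variable rule.
METADATA = {
    "types": {"variable1": "N", "variable3": "Q"},  # assumed types
    # Rule from the example: if variable1 is "l", variable3 must be 3, 7 or 16.
    "allowed_variable3_when_variable1": {"l": {3, 7, 16}},
}


def is_valid(case: dict) -> bool:
    """Check one case against the cross-variable rule."""
    rules = METADATA["allowed_variable3_when_variable1"]
    allowed = rules.get(case["variable1"])
    # No rule for this variable1 value means variable3 is unconstrained.
    return allowed is None or case["variable3"] in allowed
```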

Page 10: IAT 814 Data

How Many Variables?

• Data sets of dimensions 1, 2, 3 are common

• Number of variables per class
  – 1 - Univariate data
  – 2 - Bivariate data
  – 3 - Trivariate data
  – >3 - Hypervariate data

Page 11: IAT 814 Data

Representation

• What’s a common way of visually representing multivariate data sets?

• Graphs! (not the vertex-edge ones)

Page 12: IAT 814 Data

Basic Symbolic Displays

• Graphs
• Charts
• Maps
• Diagrams

Page 13: IAT 814 Data

Graphs

Page 14: IAT 814 Data

Graphs

• Visual display that illustrates one or more relationships among entities
• Shorthand way to present information
• Allows a trend, pattern or comparison to be easily comprehended

[Figure: example chart with series East, West and North across 1st Qtr to 4th Qtr; value axis 0 to 100]

Page 15: IAT 814 Data

Issues

• Critical to focus on task
  – Why do you need a graph?
  – What questions are being answered?
  – What data is needed to answer those questions?
  – Who is the audience?

[Figure: example graph of Money (0 to 100) against Time (0 to 5)]

Page 16: IAT 814 Data

Graph Components

• Framework
  – Measurement types, scale
  – Geometric Metadata
• Content
  – Marks, lines, points
  – Data
• Labels
  – Title, axes, ticks
  – Nominal Metadata

Page 17: IAT 814 Data

Chart

• Structure is important, relates entities to each other

• Primarily uses lines, enclosure, position to link entities

• Flow charts, family tree, organization chart

[Figure: example chart relating entities A, B and C]

Page 18: IAT 814 Data

Map

• Represents spatial relations
• Locations identified by labels
  – Nominal metadata

Page 19: IAT 814 Data

Choropleth Map

• Areas are filled and colored differently to indicate some attribute of that region
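As a rough sketch of the step behind a choropleth, the code below bins a regional attribute into equal-interval classes and assigns each class a fill. The region names, attribute values, the `classify` helper and the three hex colors are all made up for illustration; equal-interval classing is only one of several schemes a cartographer might choose.

```python
def classify(values: dict, n_classes: int) -> dict:
    """Assign each region an equal-interval class index in 0..n_classes-1."""
    lo, hi = min(values.values()), max(values.values())
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / n_classes or 1.0
    return {
        region: min(int((v - lo) / width), n_classes - 1)
        for region, v in values.items()
    }


# Hypothetical regions and attribute values, for illustration only.
density = {"A": 2.0, "B": 5.0, "C": 11.0, "D": 20.0}
FILLS = ["#eff3ff", "#6baed6", "#08519c"]  # light to dark
classes = classify(density, len(FILLS))
fills = {region: FILLS[c] for region, c in classes.items()}
```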

Page 20: IAT 814 Data

Cartography

• Cartographers and map-makers have a wealth of knowledge about the design and creation of visual information artifacts
  – Labeling, color, layout, …

Page 21: IAT 814 Data

Diagram

• Schematic picture of object or entity

• Parts are symbolic
  – Examples: figures, steps in a manual, illustrations …

Page 22: IAT 814 Data

Details

• What are the constituent pieces of these four symbolic displays?

• What are the building blocks?

Page 23: IAT 814 Data

Visual Structures

• Composed of
  – Spatial substrate
  – Marks
  – Graphical properties of marks

Page 24: IAT 814 Data

Space

• Visually dominant
• Often put axes on space to assist perception of space
• Use techniques of composition, alignment, folding, recursion, overloading to
  1) increase use of space
  2) do data encodings

Page 25: IAT 814 Data

Marks

• Things that occur in space
  – Points
  – Lines
  – Areas
  – Volumes

Page 26: IAT 814 Data

Graphical Properties

• Size, shape, color, orientation...

                        Spatial Properties   Object Properties
Expressing Extent       Position, Size       Greyscale
Differentiating Marks   Orientation          Color, Shape, Texture

Page 27: IAT 814 Data

Data

• Number of variables per class
  – 1 - Univariate data
  – 2 - Bivariate data
  – 3 - Trivariate data
  – >3 - Hypervariate data

Page 28: IAT 814 Data

Univariate Data

Page 29: IAT 814 Data

What goes where

• In univariate representations, we often think of the data case as being shown along one dimension, and the value in another

[Figure: Y axis is quantitative; graph shows change in Y over a continuous range X]

[Figure: Y axis is quantitative; graph shows value of Y for 4 cases]

Page 30: IAT 814 Data

Or…

• We may think of a graph as representing independent (data case) and dependent (value) variables

• Guideline:
  – Independent vs. dependent variables
    • Put independent on x-axis
    • See resultant dependent variables along y-axis

Page 31: IAT 814 Data

Bivariate Data

• Representations
  – Scatter plot
  – Each mark is a data case
  – Want to see relationship between two variables
  – What is the pattern?

[Figure: scatter plot with axes Price and Mileage]

Page 32: IAT 814 Data

Trivariate Data

• 3D scatter plot may work
  – Must have 3D cues
    • 3D blobs
    • motion parallax
    • stereoscopy

[Figure: 3D scatter plot with axes Price, Mileage and Horsepower]

Page 33: IAT 814 Data

Scatter Plot

• Use mark attribute for another variable
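A minimal sketch of encoding a third variable in a mark attribute, here mark size. The car values, the radius limits and the `mark_radius` helper are hypothetical. The mapping scales mark area, not radius, linearly with the value; that is a common design choice and not something the slide specifies.

```python
import math


def mark_radius(value, vmin, vmax, r_min=2.0, r_max=10.0):
    """Map value to a radius so that mark AREA grows linearly with value."""
    t = (value - vmin) / (vmax - vmin)
    return math.sqrt(r_min ** 2 + t * (r_max ** 2 - r_min ** 2))


# Hypothetical cars: (price, mileage, horsepower), for illustration only.
cars = [(18000, 30, 120), (42000, 18, 300)]

# Position carries price and mileage; size carries horsepower.
marks = [(price, mpg, mark_radius(hp, 100, 300)) for price, mpg, hp in cars]
```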

[Figure: two scatter plots with axes Price and Mileage]

Page 34: IAT 814 Data

Alternative 3D

• Represent each variable on its own line

Page 35: IAT 814 Data

Hypervariate Data

• A number of well-known visualization techniques exist for data sets of 1-3 dimensions
  – line graphs, bar graphs, scatter plots OK
  – We see a 3-D world (4-D with time)

• What about data sets with more than 3 variables?
  – Often the interesting, challenging ones

Page 36: IAT 814 Data

Multiple Views

     A    B    C    D
1    1    6    7    9
2    9   12    9   12
3    6    8    6    7
4    8    6    6    8

[Figure: plot labeled with cases 1 to 4 and variables A to D]

Each variable on its own line

Page 37: IAT 814 Data

Scatterplot Matrix aka Splom

• Represent each possible pair of variables in their own 2-D scatterplot

• Useful for what?
• Misses what?
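The pairing step of a scatterplot matrix can be sketched as follows, using the four-variable A to D table from the Multiple Views slide. The `splom_panels` name is hypothetical, and the sketch enumerates each unordered pair once, whereas a full SPLOM grid usually also shows the mirrored panels.

```python
from itertools import combinations

# Columns of the A-D table from the Multiple Views slide (cases 1 to 4).
data = {
    "A": [1, 9, 6, 8],
    "B": [6, 12, 8, 6],
    "C": [7, 9, 6, 6],
    "D": [9, 12, 7, 8],
}


def splom_panels(table):
    """Yield (x_name, y_name, points) for each unordered pair of variables."""
    for x, y in combinations(table, 2):
        yield x, y, list(zip(table[x], table[y]))


panels = list(splom_panels(data))
```

With n variables there are n(n-1)/2 such panels, so 4 variables give 6. Each panel only ever shows two variables at a time, which is worth keeping in mind for the "Misses what?" question.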

Page 38: IAT 814 Data

Much More than 3D

• Fundamentally, we have 2 display dimensions

• For data sets with >2 variables, we must project data down to 2D

• Come up with a visual mapping that locates each dimension in the 2D plane

• Computer graphics 3D->2D projections

Page 39: IAT 814 Data

Spreadsheets

• Tables allocate a unique space per value
  – Case + Variable
  – 1 case per row
  – 1 variable per column

Page 40: IAT 814 Data

Parallel Coordinates

• Case 1

V1   V2   V3   V4   V5
9    6    7    3    5

[Figure: parallel coordinates plot of Case 1 on axes V1 to V5; vertical scale 1 to 10]

Page 41: IAT 814 Data

Parallel Coordinates

• Case 2

V1   V2   V3   V4   V5
7    8    9    7    3

[Figure: parallel coordinates plot of Case 2 on axes V1 to V5; vertical scale 1 to 10]

Page 42: IAT 814 Data

Parallel Coordinates

• Each column of space is assigned a variable
• Vertical scale to left
• Each data case is a polyline that puts a vertex on each column at its corresponding data value
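The construction can be sketched as a function that turns one case into polyline vertices. The drawing-area size and the `polyline` name are assumptions; the 1 to 10 scale and the Case 1 values (9, 6, 7, 3, 5) come from the preceding slides.

```python
def polyline(case, axis_min=1, axis_max=10, width=400.0, height=200.0):
    """Return (x, y) vertices for one case; y runs from 0 (bottom) to height (top)."""
    gap = width / (len(case) - 1)  # horizontal spacing between variable columns
    return [
        (i * gap, (v - axis_min) / (axis_max - axis_min) * height)
        for i, v in enumerate(case)
    ]


case1 = [9, 6, 7, 3, 5]  # V1..V5 for Case 1
vertices = polyline(case1)
```

Drawing line segments between consecutive vertices gives the polyline for that case; repeating this for every case gives the full plot.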

Page 43: IAT 814 Data

Parallel Coords Example

Page 44: IAT 814 Data

Issues

• The range of each variable can be different:
  – Must rescale to the vertical space available
  – Each variable rescales independently
• Hard to read parallel coord plot as a static picture
  – Interaction required
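Independent rescaling can be sketched as a per-variable min-max normalization. The `rescale_columns` helper is hypothetical, and the sample columns reuse the Age and GPA values from the example student table.

```python
def rescale_columns(table):
    """Min-max rescale each variable independently to the range [0, 1]."""
    scaled = {}
    for name, values in table.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant column has no spread; place it mid-axis.
        scaled[name] = [(v - lo) / span if span else 0.5 for v in values]
    return scaled


students = {"age": [20, 22, 19], "gpa": [4.0, 2.3, 3.1]}
scaled = rescale_columns(students)
```

After rescaling, every axis uses its full vertical extent, so positions on different axes can no longer be compared as raw numbers. That is one reason axis labels and interaction matter for reading these plots.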

Page 45: IAT 814 Data

Example Problem

• VLSI chip manufacture
• Want high quality chips (high speed) and a high yield batch (% of useful chips)
• Able to track defects
• Hypothesis: No defects gives desired chip types
• 473 batches of data
  – A. Inselberg, “Multidimensional Detective”, InfoVis 1997.

Page 46: IAT 814 Data

The Data

• 16 variables
  – X1 - yield
  – X2 - quality
  – X3-X12 - # defects (inverted)
  – X13-X16 - physical parameters

Page 47: IAT 814 Data

[Figure: parallel coordinates plot with axis groups Yield, Quality, Defects and Parameters]

Distributions
• Yield: Normal
• Quality: Bimodal

Page 48: IAT 814 Data

Top Yield & Quality

• Split in parameters

Page 49: IAT 814 Data

Minimal Defects

• Not best yield & quality

Page 50: IAT 814 Data

Best Yields

Parameters that give best yields cause 2 types of defects

Page 51: IAT 814 Data

XmdvTool

• Matt Ward, WPI

• Does Parallel Coords

Page 52: IAT 814 Data

Dimensional Reordering

• Which dimensions are most alike?
• Sort dimensions according to similarity
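One simple way to sort dimensions by similarity is a greedy ordering on absolute Pearson correlation, sketched below. This is an illustrative heuristic only, not the method of any particular tool; the `pearson` and `reorder` helpers and the sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def reorder(table):
    """Start at the first dimension, then repeatedly append the remaining
    dimension most correlated (by |r|) with the one just placed."""
    names = list(table)
    order = [names.pop(0)]
    while names:
        last = order[-1]
        best = max(names, key=lambda n: abs(pearson(table[last], table[n])))
        names.remove(best)
        order.append(best)
    return order


# Hypothetical data: C is a scaled copy of A, B is loosely related, D is unrelated.
sample = {"A": [1, 2, 3, 4], "B": [2, 1, 4, 3], "C": [2, 4, 6, 8], "D": [3, 1, 4, 2]}
order = reorder(sample)
```

Placing similar dimensions on adjacent axes tends to produce less line crossing between them, which makes shared structure easier to see.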

Page 53: IAT 814 Data

Advanced Graphics

• Johansson et al., InfoVis 2005

Page 54: IAT 814 Data

Use texture mapping

• Pre-process data into clusters
• Render each cluster to texture
• Blend textures

Page 55: IAT 814 Data

Interaction with Parallel Coords

• Angular Query
• Query the difference
• Hauser et al., InfoVis 2002

Page 56: IAT 814 Data

Further elaboration

• Brush individual ranges
• Display histogram per dimension

Page 57: IAT 814 Data

Visualizing Categories

• Titanic Disaster
• Bendix et al., InfoVis 2005

Page 58: IAT 814 Data

• Parallel coordinates layout
  – Continuous axes replaced with boxes

• Uses frequency based representation
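A frequency-based representation boils down to counting categories and category pairs, as in the sketch below. The records are made up for illustration and are NOT the real Titanic figures; `box_sizes` and `link_counts` are hypothetical helper names, not taken from the cited paper.

```python
from collections import Counter

# Made-up records for illustration; NOT the real Titanic counts.
records = [
    {"class": "first", "survived": "yes"},
    {"class": "first", "survived": "no"},
    {"class": "crew", "survived": "no"},
    {"class": "crew", "survived": "no"},
]


def box_sizes(rows, dim):
    """Relative frequency of each category: the size of its box on that axis."""
    counts = Counter(r[dim] for r in rows)
    return {cat: n / len(rows) for cat, n in counts.items()}


def link_counts(rows, dim_a, dim_b):
    """Frequency of each category pair: the thickness of the band joining two boxes."""
    return Counter((r[dim_a], r[dim_b]) for r in rows)


class_boxes = box_sizes(records, "class")
links = link_counts(records, "class", "survived")
```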

Page 59: IAT 814 Data

Star Plots

• Space out the n variables at equal angles around a circle

• Each “spoke” encodes a variable’s value

• Data point is now a “shape”
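The geometry of a star plot can be sketched as follows. The `star_vertices` name, the unit radius and the shared maximum of 10 are assumptions; the sample values reuse Case 1 from the parallel coordinates slides.

```python
import math


def star_vertices(values, v_max, radius=1.0):
    """Place one vertex per variable on spokes spaced at equal angles."""
    n = len(values)
    points = []
    for i, v in enumerate(values):
        angle = 2 * math.pi * i / n   # spoke direction for variable i
        r = radius * v / v_max        # distance along the spoke encodes the value
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points


case1 = [9, 6, 7, 3, 5]
shape = star_vertices(case1, v_max=10)
```

Connecting the vertices in order and closing the polygon gives the "shape" for one data point; drawing one such shape per case makes whole cases comparable at a glance.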

Page 60: IAT 814 Data

Star Plot Examples