How to Produce Statistical Graphics

General Clinical Research Center

August 15, 2005

Rachel Enriquez

What are we going to talk about?

Why should we care about statistical graphics?

What is the theoretical framework for statistical graphics?

When do we make statistical graphics?

How can we produce good quality graphics?

Why do we care about statistical graphics?

Interpretation - good graphs help you understand your data.

Data visualization is part of analysis

Communication of results is the last step in the scientific process.

Many people can comprehend the results better by seeing them in a figure than they can by reading them in a table.

Do you have an opinion? Graphics can help persuade.

Objectively correct graphics can call attention to the result you WANT the viewer to see.

Get Attention

Can you produce exceptional statistical graphics? (me neither)

Do you want people to know that you are committed to the scientific process?

If people understand your research, they’ll listen to you and do what you tell them to.

The Theory of Statistical Graphics

Data Visualization

Visually encode the data. Viewers decode the picture.

Easy to figure out

Learn something new

See the right comparisons

Hierarchy of Visual Perception

Position along common scale

Position along nonaligned scale

Length

Angle / Slope

Area

Color

Volume

Aesthetics

A personal matter

Unless you ask Tufte

Data / Ink ratio

Avoid 3-D

Fill patterns are bad

Obtain good resolution

Text can be small (in print)

When do we make statistical graphics?

For preliminary analysis

Speed

Figure: Percent (vertical axis, 0 to 35) plotted against Age (horizontal axis, 16 to 72).

For Publication in Journals

Data density is good.

Excellent resolution is required.

Color is difficult.

Column width is a consideration.

MS Office is frequently not an option.

Too many tables!

A plot is better

Confounding variable

TABLES - Consider the on-line supplement

Maybe…

Figure: odds ratios (vertical axis, 0 to 35) for esophageal adenocarcinoma, adenocarcinoma of the gastric cardia, and esophageal squamous-cell carcinoma, by frequency of reflux symptoms (1/week, 2-3/week, >3/week), reflux symptom score (1-2, 2.5-4, 4.5-6.5 points), and duration of reflux symptoms (<12 yrs, 12-20 yrs, >20 yrs). Two layouts of the figure are shown, the second as a grid of panels with one row per variable and one column per cancer type.

The figure should be labeled!
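Resolution and column width together determine how many pixels a bitmap figure needs. A rough sketch of that arithmetic, assuming a hypothetical 3.25 inch column and a hypothetical 600 dpi requirement (neither number comes from the talk; check the journal's instructions for authors):

```python
def pixels_needed(width_in, height_in, dpi):
    """Pixel dimensions a bitmap needs to print at the given size and dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def printed_dpi(width_px, width_in):
    """Effective resolution when an existing bitmap is printed at width_in inches."""
    return width_px / width_in


COLUMN_WIDTH_IN = 3.25  # hypothetical single-column width
REQUIRED_DPI = 600      # hypothetical journal requirement

# Size a new figure for the column.
print(pixels_needed(COLUMN_WIDTH_IN, 2.5, REQUIRED_DPI))

# Check an existing 1024-pixel-wide image against the requirement.
dpi = printed_dpi(1024, COLUMN_WIDTH_IN)
print(f"{dpi:.0f} dpi, meets requirement: {dpi >= REQUIRED_DPI}")
```

A figure exported as a vector graphic sidesteps this calculation, since it is rendered at whatever resolution the printer uses.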

Oral Presentations

HPR223 2004

The boxplot (3)

A gap between the mean and the median suggests the data are skewed rather than normally distributed.

If the data are normally distributed, about 99.3% of the values should lie within the fences (1.5 IQRs beyond the quartiles); more than 99.99% should lie within 3 IQRs.
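How much of a normal distribution falls inside the boxplot fences can be checked numerically. A minimal sketch using Python's standard library; the helper name is illustrative, and the standard normal is used because the answer does not depend on the mean or standard deviation:

```python
from statistics import NormalDist


def coverage_within_fences(k):
    """Fraction of a normal population between Q1 - k*IQR and Q3 + k*IQR."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)
    iqr = q3 - q1
    return z.cdf(q3 + k * iqr) - z.cdf(q1 - k * iqr)


for k in (1.5, 3.0):
    print(f"within {k} IQRs of the quartiles: {coverage_within_fences(k):.4%}")
```

For k = 1.5 this comes to roughly 99.3%, so under normality fewer than 1 value in 100 is expected beyond the fences.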

Data density should be moderate.

Color is available.

LABEL!

Hope you have interesting data

Posters

Smaller audience

Experimentation is good.

Graphics will bring you customers!

Experimentation may, or may NOT work.

How do I do this?

How much time do you have?

It is not easy.

There is no perfect, easy-to-use, cheap software that is going to solve your problems.

This is not too hard

Books are not very helpful

Software changes quickly.

People use different software.

You want to do it NOW, not after reading for 5 hours.

Surfing the net is frequently useful.

Vector Graphics vs Bitmaps

Vector graphics

A set of instructions that tells the device how to display the document.

Adobe software is the most common way to edit vector graphics.

Bitmaps

Resolution depends on the size of the computer file.

Easy to open and publish on-line.

Generally not accepted for publication.

Vector graphics can be made into bitmaps. Bitmaps cannot be made into vector graphics.
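The one-way relationship can be illustrated with a toy example. In this sketch the "vector file" is a made-up list of bar instructions in abstract 0 to 1 units (a hypothetical mini-format, not SVG, PDF, or Windows Metafile), and `rasterize` turns it into a pixel grid of any requested size:

```python
# Each instruction describes one bar: (x_left, x_right, height), all in 0..1 units.
BARS = [
    (0.1, 0.3, 0.50),
    (0.4, 0.6, 0.75),
    (0.7, 0.9, 0.25),
]


def rasterize(instructions, width_px, height_px):
    """Render the instructions into a bitmap: a grid of 0/1 pixels."""
    grid = [[0] * width_px for _ in range(height_px)]
    for x_left, x_right, height in instructions:
        for row in range(height_px):
            # Centre of this pixel row, measured up from the bottom edge.
            y = 1 - (row + 0.5) / height_px
            for col in range(width_px):
                x = (col + 0.5) / width_px
                if x_left <= x < x_right and y <= height:
                    grid[row][col] = 1
    return grid


def show(grid):
    """Draw the bitmap as text, '#' for ink and '.' for background."""
    return "\n".join("".join("#" if p else "." for p in row) for row in grid)


small = rasterize(BARS, 10, 4)   # coarse bitmap
large = rasterize(BARS, 40, 16)  # same instructions, four times the resolution
print(show(small))
print()
print(show(large))
```

The same three instructions render cleanly at either size. Going the other way is the hard part: the pixel grid records only which cells are filled, not that the picture was made of three bars.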

Bitmaps, compression, and enlarging

Compression can be ‘lossy’.

We are familiar with the grainy effect of enlargement.
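Both effects can be seen on a tiny bitmap. The sketch below uses a hypothetical 2 x 4 greyscale image; `quantize` is only a crude stand-in for lossy compression (real lossy formats such as JPEG work differently), while `zlib` provides genuinely lossless compression of the kind used inside PNG files:

```python
import zlib


def enlarge(grid, factor):
    """Nearest-neighbour enlargement: each pixel becomes a factor x factor block."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]


def quantize(grid, step):
    """Crude lossy step: round every grey level down to a multiple of step."""
    return [[(pixel // step) * step for pixel in row] for row in grid]


# Hypothetical greyscale bitmap (0 = black, 255 = white).
bitmap = [
    [0, 60, 130, 255],
    [10, 70, 140, 250],
]

# Enlarging adds pixels but no new detail: the same 8 values, in bigger blocks.
big = enlarge(bitmap, 3)
print(len(big), "x", len(big[0]), "pixels,", len({p for row in big for p in row}), "distinct values")

# Lossless compression: the original bytes come back exactly.
raw = bytes(p for row in bitmap for p in row)
restored = zlib.decompress(zlib.compress(raw))
print("lossless round trip identical:", restored == raw)

# Lossy step: nearby grey levels collapse together and cannot be recovered.
lossy = quantize(bitmap, 64)
print(lossy)
```

The blocks produced by `enlarge` are the grainy effect: more pixels, the same information.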

Software

SPSS: Many chart options. Graphics can be edited. Can export vector graphics.

SAS: Known for poor graphics. However, some people produce very good graphs with SAS. Hope SAS improves and use something else for now?

Stata: Any comments?

R: It is free. Produces good graphics that can be exported in various formats. Infinitely customizable. Difficult for the novice statistician / programmer. R clinic.

SyStat, EpiInfo, S+, Spotfire, Prism (also available in the GCRC computer lab). Others…

Sigma Plot

Can be used with Excel and SPSS. Opens other data formats. Menu driven. Multiple graphics options. Easily produces compound graphics. Exports graphics in multiple formats.

MS Office

Windows Metafile is a vector graphic format.

Excel: More control over graphics. Limited selection of graph types. User typically provides the S.E.s and effect estimates.
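Because the estimates and standard errors have to be supplied by the user, they need to be computed before charting. A minimal sketch of that step for group means, using made-up measurements (the group names and values are hypothetical, not data from the talk) and printing tab-separated rows that can be pasted into a spreadsheet:

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Return the sample mean and its standard error, s / sqrt(n)."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical measurements for two groups.
groups = {
    "group A": [2.0, 3.0, 4.0, 3.0],
    "group B": [4.0, 6.0, 5.0, 5.0],
}

print("group\testimate\tse")
for name, values in groups.items():
    estimate, se = mean_and_se(values)
    print(f"{name}\t{estimate:.2f}\t{se:.2f}")
```

Other effect estimates, such as odds ratios, need their own standard error formulas; this sketch covers only the simple mean.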

PowerPoint: Surprisingly good at managing bitmaps. If you already use it, then improve your graphics by applying aesthetic rules.

For example…

Figure: two versions of a chart comparing negative and positive family history for hay fever, asthma, eczema, and current wheeze (vertical axis, 0 to 6).

Scanners

Scanned figures are an option.

Good way to clean up figures from journals if you’re proficient in Photoshop.

The bitmap resolution problem remains.

Which file format and program will avoid lossy compression?

Art Software

As a novice graphic preparator, I appreciate the ability to draw on graphs.

Can also ‘cover’ unwanted parts with white shapes.

Group the resulting collection of shapes and save as a picture.

Adobe Illustrator, Adobe Photoshop: These programs may seem counter-intuitive at first use.

Paint, MS Office, etc.: Easy to use. Bitmap products.

Call the experts

The Medical Illustrators at VUMC will improve your graphs.

$50/hr

Average graph is 20 minutes.

Grow your own group ‘expert’.