Statistics for Everyone Workshop Fall 2010 Part 1 Statistics as a Tool in Scientific Research: Summarizing & Graphically Representing Data Workshop presented by Linda Henkel and Laura McSweeney of Fairfield University Funded by the Core Integration Initiative and the Center for Academic Excellence at Fairfield University

Statistics for Everyone Workshop Fall 2010

Part 1

Statistics as a Tool in Scientific Research:

Summarizing & Graphically Representing Data

Workshop presented by Linda Henkel and Laura McSweeney of Fairfield University

Funded by the Core Integration Initiative and the Center for Academic Excellence at Fairfield University

Statistics as a Tool in Scientific Research

Types of Research Questions

• Descriptive (What does X look like?)

• Correlational (Is there an association between X and Y? As X increases, what does Y do?)

• Experimental (Do changes in X cause changes in Y?)

Different statistical procedures allow us to answer the different kinds of research questions

Statistics as a Tool in Scientific Research

Start with the science and use statistics as a tool to answer the research question

Get your students to formulate a research question first:

• How often does this happen?

• Did all plants/people/chemicals act the same?

• What happens when I add more sunlight, give more praise, pour in more water?

Statistics as a Tool in Scientific Research

Can collect data in class

Can use already collected data (yours or database)

Helping students to formulate a research question: Ask them to think about what would be interesting to know. What do they want to find out? What do they expect?

For example, what research questions might you ask from the survey?

Descriptive, correlational, experimental

Types of Data: Measurement Scales

Categorical: male/female; blood type (A, B, AB, O); Stage 1, Stage 2, Stage 3 melanoma

Numerical: weight; # of white blood cells; mpg

Types of Data: Measurement Scales

Categorical:

• Nominal (name/label)

• Ordinal (rank order)

Numerical:

• Interval (equal intervals)

• Ratio (equal intervals and absolute zero)

Types of Data: Measurement Scales

Nominal: numbers are arbitrary; 1 = male, 2 = female

Ordinal: numbers have order (i.e., more or less) but you do not know how much more or less; 1st place runner was faster but you do not know how much faster than 2nd place runner

Interval: numbers have order and equal intervals so you know how much more or less; A temperature of 102 is 2 degrees higher than one of 100

Ratio: same as interval but because there is an absolute zero you can talk meaningfully about twice as much and half as much; Weighing 200 pounds is twice as heavy as 100 pounds
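One way to show students the interval/ratio distinction is to change units: a ratio survives the change on a ratio scale but not on an interval scale. The sketch below is only an illustration, assuming Fahrenheit temperatures and weights in pounds as the examples; the helper names are made up for this sketch and are not part of the workshop materials.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Convert a weight in pounds to kilograms."""
    return lb * 0.45359237


# Ratio scale (absolute zero): "twice as heavy" holds in any unit.
weight_ratio_lb = 200 / 100
weight_ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)

# Interval scale (arbitrary zero): 100 F looks like "twice" 50 F,
# but the same two temperatures in Celsius give a different ratio.
temp_ratio_f = 100 / 50
temp_ratio_c = fahrenheit_to_celsius(100) / fahrenheit_to_celsius(50)

print("weight ratio:", weight_ratio_lb, "vs", round(weight_ratio_kg, 2))
print("temperature ratio:", temp_ratio_f, "vs", round(temp_ratio_c, 2))
```

The weight ratios agree, while the temperature ratios do not, which is why "twice as hot" is not a meaningful statement for Fahrenheit or Celsius readings.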

Types of Data on Questionnaire

1. What College are you from? CAS / Business / Engineering / Nursing / Other

2. How many years have you been teaching at Fairfield University?

3. How important do you think it is to integrate statistics into your courses? Not at all / Somewhat Important / Important / Very Important

4. How excited are you about integrating statistics into your courses? 1 2 3 4 5 6 7 (1 = Not at all excited, 7 = Extremely excited)

Types of Data on Questionnaire

5. Are you male or female?

6. How many hours a week do you watch television on average?

7. How many hours a day do you spend on the internet on average?

8. Of the following reality/game shows, which one would you most like to be on? (a) Dancing With the Stars (b) American Idol (c) Bachelor/Bachelorette (d) The Apprentice

9. Can you roll your tongue? Yes / No

10. How many siblings do you have?

Types of Statistical Procedures

Descriptive: Organize and summarize data

Inferential: Draw inferences about the relations between variables; use samples to generalize to population

Descriptive Statistics

The first step is ALWAYS getting to know your data

Summarize and visualize your data

It is a big mistake to just throw numbers into the computer and look at the output of a statistical test without any idea what those numbers are trying to tell you or without checking if the assumptions for the test are met.

Descriptive Statistics

Numerical Summaries:

• Frequencies

• Contingency tables

• Measures of central tendency

• Measures of variability

• Representing numerical summaries in tables

Graphical Summaries:

• Bar graphs or Pie graphs

• Histograms

• Scatterplots

• Time series plot

Summarizing and Reporting Categorical Data

 Frequency = number of times each score occurs in a set of data

Relative Frequency = percent or proportion of times each score occurs in a set of data

  

Frequency Table

Marital Status    Frequency (f)    Relative Frequency (rel f)
Married                 34                  .14
Widowed                129                  .54
Divorced                35                  .15
Separated               30                  .12
Never Married           13                  .05
Total                  241                 1.00
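The relative frequencies in the table are each frequency divided by the total number of observations. A minimal sketch of that arithmetic, using the counts from the table above (the variable names are illustrative only):

```python
# Frequencies copied from the marital status table
counts = {
    "Married": 34,
    "Widowed": 129,
    "Divorced": 35,
    "Separated": 30,
    "Never Married": 13,
}

total = sum(counts.values())

# Relative frequency = frequency / total number of observations
rel_freq = {status: f / total for status, f in counts.items()}

for status, f in counts.items():
    print(f"{status:<15}{f:>5}{rel_freq[status]:>8.2f}")
print(f"{'Total':<15}{total:>5}{sum(rel_freq.values()):>8.2f}")
```

Multiplying each relative frequency by 100 gives the percent version of the same summary.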

Contingency Table

A display to summarize two categorical variables in a table.

Each entry in the table represents the number of observations in a sample with a certain outcome for the 2 variables.

Contingency Tables

Gender    Binge Drinker    Non-binge Drinker    Total
Male           1908              2017            3925
Female         2854              4125            6979
Total          4762              6142           10904

Contingency Tables

Gender    Binge Drinker    Non-binge Drinker    Total
Male            49%               51%            3925
Female          41%               59%            6979
Total          4762              6142           10904
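The percentages in this version of the table are row percentages: each count is divided by the total for its gender, not by the grand total. A short sketch of that calculation, using the counts from the contingency table (the dictionary layout is just one convenient assumption):

```python
# Counts from the gender by binge-drinking contingency table
table = {
    "Male": {"Binge": 1908, "Non-binge": 2017},
    "Female": {"Binge": 2854, "Non-binge": 4125},
}

# Row percentages: each cell divided by its row (gender) total
row_pct = {}
for gender, row in table.items():
    row_total = sum(row.values())
    row_pct[gender] = {
        outcome: round(100 * count / row_total)
        for outcome, count in row.items()
    }

for gender, row in row_pct.items():
    print(gender, row)
```

Conditioning on gender this way makes the two groups comparable even though the sample contains far more females than males.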

Choosing the Appropriate Type of Graph

One categorical variable (e.g., Political party): Bar Chart or Pie Graph

Two categorical variables (e.g., Political party vs. Gender): Side-by-side Bar Chart

*Notice with 2 variables, one variable may be treated as the dependent variable and one variable may be treated as the independent variable.

Choosing the Appropriate Type of Graph

One numerical variable (e.g., Height): Histogram

One numerical variable and one categorical variable (e.g., Height vs. Gender): Side-by-side Histograms

Two paired numerical variables (e.g., Weight vs. Exercise per week): Scatterplot

One numerical variable over time (e.g., Number of Cells vs. Minutes): Time Series Plot

*Notice with 2 variables, one variable may be treated as the dependent variable and one variable may be treated as the independent variable.
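The choices on these two slides amount to a small lookup from the types of the variables to a graph. The sketch below encodes only the combinations listed above; the function name and its arguments are hypothetical, chosen for illustration.

```python
def suggest_graph(variable_types, over_time=False):
    """Suggest a graph for a tuple of variable types.

    variable_types: tuple containing "categorical" and/or "numerical".
    over_time: True when one numerical variable is tracked over time.
    """
    kinds = tuple(sorted(variable_types))
    if over_time and kinds == ("numerical",):
        return "Time Series Plot"
    choices = {
        ("categorical",): "Bar Chart or Pie Graph",
        ("categorical", "categorical"): "Side-by-side Bar Chart",
        ("numerical",): "Histogram",
        ("categorical", "numerical"): "Side-by-side Histograms",
        ("numerical", "numerical"): "Scatterplot",
    }
    if kinds not in choices:
        raise ValueError(f"no suggestion for {variable_types!r}")
    return choices[kinds]


print(suggest_graph(("categorical",)))              # e.g., Political party
print(suggest_graph(("numerical", "categorical")))  # e.g., Height vs. Gender
print(suggest_graph(("numerical",), over_time=True))
```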

Bar Graph (Frequency)

Bar Graph (Relative Frequency)

Pie Chart

Simple Frequency Tables and Bar Graphs

Side by Side Bar Charts

[Figure: side-by-side bar chart of the proportions of binge and non-binge drinkers for males (M) and females (F); y-axis: Proportions, 0.00 to 0.70]

Conditioned Proportions

          Binge Drinker    Nonbinge Drinker
Male          0.49               0.51
Female        0.41               0.59

Histogram of Simple Frequency Data

Side by Side Histograms

[Figure: two histograms, "Size Frequency Distribution for Male Crabs" (Counts axis 0 to 14) and "Size Frequency Distribution for Female Crabs" (Counts axis 0 to 4.5); x-axis on both: CW (mm), in 4 mm intervals from 11.00 to 51.00]

Scatterplot

Percent Overweight    Threshold of Pain
       89                    2
       90                    3
       75                    4
       30                    4.5
       51                    5.5
       75                    7
       62                    9
       45                   13
       90                   15
       20                   14

Time Series Plot

[Figure: time series plot titled "Software File Updates"; x-axis: Month Number, 0 to 14; y-axis: Number of Updates, 0 to 500]

Month    Number of Updates
1            323
2            268
3            290
4            405
5            383
6            368
...          ...
12            75

Shapes of Distributions

• Normal (approximately symmetric)

• Skewed

• Unimodal/Bimodal/Uniform/Other

• Outliers

The Normal Curve

“Bell-shaped”

Most scores in center, tapering off symmetrically in both tails

Amount of peakedness (kurtosis) can vary

Variations to Normal Distribution

Skew = Asymmetrical distribution

• Positive/right skew = greater frequency of low scores than high scores (longer tail on high end/right)

• Negative/left skew = greater frequency of high scores than low scores (longer tail on low end/left)
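Besides looking at the histogram, the direction of skew can be checked numerically. The sketch below uses the moment coefficient of skewness, which is one common measure and is not named on the slides; the two small data sets are made up for illustration. A positive value points to a longer right tail and a negative value to a longer left tail.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical scores: mostly low values with one high value (right skew)
right_skewed = [1, 1, 1, 2, 2, 3, 4, 9]
# Mirror image: mostly high values with one low value (left skew)
left_skewed = [9, 9, 9, 8, 8, 7, 6, 1]

print("right-skewed data:", round(skewness(right_skewed), 2))
print("left-skewed data:", round(skewness(left_skewed), 2))
```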

Histogram Showing Positive (Right) Skew

Variations to Normal Distribution

Bimodal distribution: two peaks

Rectangular/Uniform: all scores occur with equal frequency

Potential Outlier: An observation that is well above or below the overall bulk of the data

Important to determine normality (look at the histogram of the data) so you can choose appropriate measures of central tendency and variability

Variations to Normal Distribution

[Figure: histogram titled "Bimodal Distribution (two peaks)"; x-axis: Weight, in intervals from 100-119 to 220-239; y-axis: Number of People, 0 to 25]

[Figure: histogram titled "Rectangular/Uniform Distribution (equal # highs and lows)"; x-axis: Weight, in intervals from 100-119 to 220-239; y-axis: Number of People, 0 to 16]

[Figure: histogram titled "Potential Outlier in Distribution"; x-axis: Weight, in intervals from 100-119 to 280-299; y-axis: Number of People, 0 to 16]

Examples of Bad Graphs: What is Wrong With the Picture?

[Figure: bar graph of Pollen Count by Location (On campus, In town); y-axis runs from 7.7 to 8.4]

Examples of Bad Graphs: What is Wrong With the Picture?

[Figure: bar graph of Pollen Count by Location (On campus, In town); y-axis runs from 0 to 50]

A BETTER Graph

[Figure: bar graph of Pollen Count by Location (On campus, In town); y-axis runs from 0 to 10]

[Figure: graph of Average Pollen Count by Year (2006, 2007); y-axis runs from 0 to 5]

[Figure: graph of Pollen Count by Year (2000 to 2008); y-axis runs from 0 to 6]

[Figure: graph of Pollen Count by Year (2000 to 2008); y-axis runs from 0 to 20]

Example of a Bad Graph

Graph the distribution of the first digits.

First Digit    Frequency
1                109
2                 75
3                 77
4                 99
5                 72
6                117
7                 89
8                 62
9                 43

[Figure: graph of the first-digit frequencies; x-axis: 1 to 9; y-axis: 0 to 140; the only text on the graph is "FIRST" and "NUMBER"]

Example of a Bad Graph

Graph the distribution of the first digits.

First Digit    Frequency
1                109
2                 75
3                 77
4                 99
5                 72
6                117
7                 89
8                 62
9                 43

[Figure: graph of the first-digit frequencies; x-axis: 1 to 9; y-axis: 0 to 140; the only text on the graph is "NUMBER"]

Example of a GOOD Graph

Graph the distribution of the first digits.

First Digit    Frequency
1                109
2                 75
3                 77
4                 99
5                 72
6                117
7                 89
8                 62
9                 43

[Figure: bar graph titled "Distribution of the First Digits"; x-axis: First Digit, 1 to 9; y-axis: Number of Occurrences, 0 to 140]

Example of a Bad Graph

Graph the distribution of the number of bacteria in the cultures sampled.

Number of Bacteria: 41, 33, 43, 52, 46, 37, 44, 49, 53, 30

[Figure: graph titled "Bacteria"; x-axis: 1 to 10; y-axis: 0 to 60; the only other text on the graph is "BAC"]

Example of a Bad Graph

Graph the distribution of the number of bacteria in the cultures sampled.

Number of Bacteria: 41, 33, 43, 52, 46, 37, 44, 49, 53, 30

[Figure: histogram titled "Distribution of Bacteria"; x-axis: Number of Bacteria, in intervals of 3 from 30.00 to 60.00; y-axis: Number of Samples, 0 to 2.5]

Example of a GOOD Graph

Graph the distribution of the number of bacteria in the cultures sampled.

Number of Bacteria: 41, 33, 43, 52, 46, 37, 44, 49, 53, 30

[Figure: histogram titled "Distribution of Bacteria"; x-axis: Number of Bacteria, in intervals of 10 from 30.00 to 130.00; y-axis: Number of Samples, 0 to 6]
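The difference between the two histograms of the bacteria data comes down to the interval width. The sketch below counts the ten values from the slide into intervals of width 3 and of width 10. It assumes each interval includes its lower edge and excludes its upper edge, which the slides do not specify, and the function name is illustrative.

```python
# The ten bacteria counts listed on the slide
bacteria = [41, 33, 43, 52, 46, 37, 44, 49, 53, 30]


def bin_counts(values, start, width, n_bins):
    """Count values into n_bins equal-width intervals [low, low + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Width 3, covering 30 to 60: many intervals, each with 0, 1, or 2 samples
narrow = bin_counts(bacteria, start=30, width=3, n_bins=10)

# Width 10, covering 30 to 60: three intervals with a clearer shape
wide = bin_counts(bacteria, start=30, width=10, n_bins=3)

print("width 3: ", narrow)
print("width 10:", wide)
```

With only ten observations, the narrow intervals leave almost every bar at a height of 1 or 2, so the shape of the distribution is hard to see; the wider intervals gather the data into a single peak.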

Guidelines for Good Graphs

• Label both axes and provide a heading to make clear what the graph is representing.

• Vertical axes should usually start at 0 to help our eyes compare relative sizes.

• Remove any clutter that isn’t needed or is distracting.

• The axes may need to be resized to remove extra white space.

• Be careful with unusual bars: it is easy for them to misrepresent the relative percentages the figures are meant to show.

• Sometimes displaying information for more than one group on the same graph can be difficult especially when the values differ greatly. Consider using relative frequencies or separate graphs instead.
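The guideline about starting the vertical axis at 0 can be made concrete by comparing how tall two bars are drawn under different axis starts. In the sketch below the two pollen counts are hypothetical (the values behind the slides' pollen graphs are not given), and the function name is illustrative; the 7.7 axis start matches the truncated pollen graph shown earlier.

```python
def drawn_height_ratio(smaller, larger, axis_start):
    """Ratio of the drawn bar heights when the y-axis begins at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical pollen counts, chosen only for illustration
on_campus = 7.9
in_town = 8.3

truncated = drawn_height_ratio(on_campus, in_town, axis_start=7.7)
zero_based = drawn_height_ratio(on_campus, in_town, axis_start=0.0)

print("axis starts at 7.7:", round(truncated, 2))
print("axis starts at 0:  ", round(zero_based, 2))
```

With the truncated axis the second bar is drawn about three times as tall as the first, although the underlying values differ by only about 5 percent.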

Other Guidelines to Making Graphs

• The Y axis should be about ¾ as tall as the X axis is long

• When the number of score values on X axis is large, scores should be collapsed so there are at least 5 intervals but no more than 12

• The width of each interval on the X axis should be equal

• Frequency on the Y axis must be continuous and regular

• Range on the Y axis and X axis must neither unduly compress nor unduly stretch the data
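The guideline of at least 5 but no more than 12 equal-width intervals can be turned into a simple search over candidate widths. This is only a sketch under stated assumptions: the list of "round" candidate widths, the preference for the widest width that qualifies, and intervals that include their lower edge are all choices made here, not rules from the slides.

```python
import math


def choose_interval_width(low, high, candidates=(1, 2, 5, 10, 20, 25, 50, 100)):
    """Return (width, number_of_intervals) with 5 to 12 equal intervals.

    Tries the widest candidate first, so the result uses as few
    intervals as the 5-to-12 guideline allows.
    """
    for width in sorted(candidates, reverse=True):
        start = math.floor(low / width) * width  # align start to the width
        n_intervals = math.floor((high - start) / width) + 1
        if 5 <= n_intervals <= 12:
            return width, n_intervals
    raise ValueError("no candidate width gives 5 to 12 intervals")


# Smallest and largest bacteria counts from the earlier example
width, n_intervals = choose_interval_width(30, 53)
print("interval width:", width, "number of intervals:", n_intervals)
```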

Time to Practice

• Looking at shape of distribution

• Making graphs

Teaching tips:

• Hands-on practice is important for your students

• Sometimes working with a partner helps