Statistical Analysis

83
Statistical Analysis IB Diploma Biology Stephen Taylor Image: 'Hummingbird Checks Out Flower' http://www.flickr.com/photos/25659032@N07/7200193254 Found on flickrcc .net
  • date post

    13-Sep-2014
  • Category

    Education


description

For the IB Diploma Programme Biology course.

Transcript of Statistical Analysis

Page 1: Statistical Analysis

Statistical Analysis: IB Diploma Biology

Stephen Taylor

Image: 'Hummingbird Checks Out Flower' http://www.flickr.com/photos/25659032@N07/7200193254 Found on flickrcc.net

Assessment Statements (with Obj)

1.1.1 State that error bars are a graphical representation of the variability of data. (Obj 1)
  • Range and standard deviation show the variability (spread) in the data.
  • 95% confidence interval error bars suggest significance of difference where there is no overlap.

1.1.2 Calculate the mean and standard deviation of a set of values. (Obj 2)
  • Using Excel (formula: =STDEV(rawdata))
  • Using your calculator

1.1.3 State that the term standard deviation (s) is used to summarize the spread of values around the mean, and that 68% of all data fall within ±1 standard deviation of the mean. (Obj 1)

1.1.4 Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples. (Obj 3)
  • A greater standard deviation shows a greater variability of data around the mean.
  • This can be used to infer reliability in methods or results.

1.1.5 Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables. (Obj 3)
  • Using t-values, t-tables and critical values
  • Directly calculating P values using Excel in lab reports

1.1.6 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables. (Obj 3)

Assessment statements from Online IB Biology Subject Guide. Command terms: httpi-biologynetibdpbiocommand-terms

MrT's Excel Statbook has guidance and 'live' examples of tables, graphs and statistical tests:

httpi-biologynetict-in-ib-biologyspreadsheets-graphingstatexcel

"Why is this Biology?"

Variation in populations and variability in results affect confidence in conclusions.

The key methodology in Biology is hypothesis testing through experimentation.

Carefully-designed and controlled experiments and surveys give us quantitative (numeric) data that can be compared.

We can use the data collected to test our hypothesis and form explanations of the processes involved… but only if we can be confident in our results.

We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data.

Image: Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex http://www.flickr.com/photos/71183136@N08/6959590092 Found on flickrcc.net

"Which medicine should I prescribe?"

Image from http://www.msf.org/international-activity-report-2010-sierra-leone
Donate to Médecins Sans Frontières through Biology4Good: httpi-biologynetaboutbiology4good

Generic drugs are out-of-patent and are much cheaper than the proprietary (brand-name) equivalents. Doctors need to balance needs with available resources. Which would you choose?

Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability.

Which would you choose now?

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower).

In return for food, they pollinate the flower. This is an example of mutualism: benefit for all.

As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival.

Photo: Archilochus colubris, from Wikimedia Commons, by Dick Daniels

Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird).

To do this, they need to collect sufficient, relevant, reliable data so they can test the null hypothesis (H0) that:

"there is no significant difference in bill length between the two species."

Photo: Archilochus colubris (male), Wikimedia Commons, by Joe Schneid

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broad-billed hummingbird (Wikimedia Commons)

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

          Bill length (±0.1 mm)
  n       A. colubris   C. latirostris
  1          13.0           17.0
  2          14.0           18.0
  3          15.0           18.0
  4          15.0           18.0
  5          15.0           19.0
  6          16.0           19.0
  7          16.0           19.0
  8          18.0           20.0
  9          18.0           20.0
  10         19.0           20.0
  Mean
  s

Calculate the mean using:
  • Your calculator (sum of values / n)
  • Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case n = 10 for each group.

All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.
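The mean calculation can also be sketched outside Excel. The following is a minimal Python illustration, not part of the original slides; it assumes the Table 1 values are read to one decimal place (13.0 mm, 14.0 mm and so on), and the variable names are chosen here for illustration only.

```python
from statistics import mean

# Bill lengths in mm (±0.1 mm), read from Table 1 (assumed one decimal place)
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# mean = sum of values / n, the same calculation as Excel's =AVERAGE(...)
mean_a = mean(a_colubris)
mean_c = mean(c_latirostris)

# Report to one decimal place, consistent with the ±0.1 mm uncertainty
print(f"A. colubris:    {mean_a:.1f} mm (n={len(a_colubris)})")
print(f"C. latirostris: {mean_c:.1f} mm (n={len(c_latirostris)})")
```

Formatting with `:.1f` keeps the reported mean in line with the uncertainty of the measuring tool.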

Table 1 with the means filled in (raw measurements unchanged):

               A. colubris   C. latirostris
  Mean (mm)       15.9           18.8

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool)

Uncertainties must be included

Descriptive table title and number

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0. Labeled points: A. colubris 15.9 mm; C. latirostris 18.8 mm.]

Descriptive title with graph number

Labeled point

Y-axis clearly labeled with uncertainty

Make sure that the y-axis begins at zero

x-axis labeled


From the means alone, you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

          Bill length (±0.1 mm)
  n       A. colubris   C. latirostris
  1          13.0           17.0
  2          14.0           18.0
  3          15.0           18.0
  4          15.0           18.0
  5          15.0           19.0
  6          16.0           19.0
  7          16.0           19.0
  8          18.0           20.0
  9          18.0           20.0
  10         19.0           20.0
  Mean       15.9           18.8
  s          1.91           1.03

Standard deviation can have one more decimal place. In Excel: =STDEV(highlight RAW data)

Which of the two sets of data has:

a. The longest mean bill length?

b. The greatest variability in the data?


Answers:

a. C. latirostris

b. A. colubris
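As a cross-check on the s values in Table 1, here is a minimal Python sketch (an illustration added here, not from the slides). It assumes the Table 1 values are read to one decimal place, and it uses the sample standard deviation (n - 1 in the denominator), which is what Excel's =STDEV computes.

```python
from statistics import stdev

# Bill lengths in mm (±0.1 mm), read from Table 1 (assumed one decimal place)
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation (divides by n - 1), like Excel's =STDEV(...)
sd_a = stdev(a_colubris)
sd_c = stdev(c_latirostris)

# One more decimal place than the raw data is acceptable for s
print(f"A. colubris:    s = {sd_a:.2f} mm")
print(f"C. latirostris: s = {sd_c:.2f} mm")
print("Greater variability:", "A. colubris" if sd_a > sd_c else "C. latirostris")
```

The larger s for A. colubris reflects the wider spread of its raw values (13.0 to 19.0 mm) compared with C. latirostris (17.0 to 20.0 mm).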

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data has:

a. The highest mean?

b. The greatest variability in the data?

A

B

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0. Labeled points: A. colubris 15.9 mm; C. latirostris 18.8 mm.]

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars. Variability has been visualised.

The error bars overlap somewhat. What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap
  • Lots of shared data points within each data set.
  • Results are not likely to be significantly different from each other.
  • Any difference is most likely due to chance.

No overlap
  • No (or very few) shared data points within each data set.
  • Results are more likely to be significantly different from each other.
  • The difference is more likely to be 'real'.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), -3.0 to 22.0. Labeled points: A. colubris 15.9 mm (n=10); C. latirostris 18.8 mm (n=10).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.
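The size of that overlap can be estimated numerically. The sketch below is an illustration added here, not part of the slides; it assumes each error bar spans mean ± 1 sample standard deviation and that the Table 1 values are read to one decimal place. The helper name `sd_interval` is hypothetical.

```python
from statistics import mean, stdev

# Bill lengths in mm (±0.1 mm), read from Table 1 (assumed one decimal place)
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def sd_interval(values):
    """Return (lower, upper) ends of a mean ± 1 SD error bar."""
    m, s = mean(values), stdev(values)
    return m - s, m + s

low_a, high_a = sd_interval(a_colubris)
low_c, high_c = sd_interval(c_latirostris)

# A positive value means the two error bars share a range; negative means a gap
overlap = min(high_a, high_c) - max(low_a, low_c)

print(f"A. colubris bar:    {low_a:.2f} to {high_a:.2f} mm")
print(f"C. latirostris bar: {low_c:.2f} to {high_c:.2f} mm")
print(f"Overlap: {overlap:.2f} mm")
```

An overlap this small is only a clue, which is why a statistical test is still needed to judge significance.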

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0): "There is no significant difference."

This is the 'default' hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant.
  • If we accept H0, then the means are not significantly different.
  • If we reject H0, then the means are significantly different.

Remember:
  • We are never 'trying' to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table of "critical values":

  P value        0.1     0.05    0.02    0.01
  Confidence     90%     95%     98%     99%

  Degrees of freedom
   1             6.31   12.71   31.82   63.66
   2             2.92    4.30    6.96    9.92
   3             2.35    3.18    4.54    5.84
   4             2.13    2.78    3.75    4.60
   5             2.02    2.57    3.37    4.03
   6             1.94    2.45    3.14    3.71
   7             1.89    2.36    3.00    3.50
   8             1.86    2.31    2.90    3.36
   9             1.83    2.26    2.82    3.25
  10             1.81    2.23    2.76    3.17

We can calculate the value of 't' for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

"Degrees of Freedom (df)" is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?


We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation: we are working in two directions, within each population and across populations.

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

From the t-table, at P = 0.05, the critical value (cv) is 2.069.

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

t = 2.15 > cv = 2.069, so we reject H0.

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
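The decision rule above can be written as a short function. This is a minimal sketch added for illustration; the function name `t_test_decision` is hypothetical, and taking the absolute value of t is an assumption made here so that the sign of the difference does not matter in a two-tailed comparison.

```python
def t_test_decision(t, critical_value):
    """Apply the slide's rule: reject H0 if t exceeds the critical value."""
    # abs() is an added assumption: a two-tailed test ignores the sign of t
    if abs(t) > critical_value:
        return "reject H0 (there is a significant difference)"
    return "accept H0 (there is no significant difference)"

# Values from the worked example: t = 2.15, cv = 2.069 at P = 0.05
print(t_test_decision(2.15, 2.069))
```

The critical value itself still has to be looked up in the t-table for the correct degrees of freedom and P value.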

cv = 2.045

Conclusion: "There is no significant difference in the size of shells between north-side and south-side snail populations."

cv = 2.086

Conclusion: "There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability that the difference is due to chance), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two 'tails' moving away from the mean.

Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B.)
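For comparison with the Excel result, the t value for the hummingbird data can be calculated by hand. The sketch below is an illustration added here, under two assumptions: the Table 1 values are read to one decimal place, and the unpaired test pools the variance of the two groups (the equal-variance form of the two-sample t-test). It uses only the Python standard library, so it compares t with a critical value rather than computing P; the critical value 2.10 is the df = 18, P = 0.05 entry of the deck's two-tailed t-table.

```python
from math import sqrt
from statistics import mean, variance

# Bill lengths in mm (±0.1 mm), read from Table 1 (assumed one decimal place)
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

n_a, n_c = len(a_colubris), len(c_latirostris)
df = n_a + n_c - 2  # total sample size minus two

# Pooled sample variance (assumes the two groups have similar variance)
pooled_var = ((n_a - 1) * variance(a_colubris)
              + (n_c - 1) * variance(c_latirostris)) / df

# Standard error of the difference between the two means
se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_c))

t = abs(mean(a_colubris) - mean(c_latirostris)) / se_diff

critical_value = 2.10  # two-tailed t-table, df = 18, P = 0.05

print(f"df = {df}, t = {t:.2f}, cv = {critical_value}")
print("reject H0" if t > critical_value else "accept H0")
```

Because t is well above the critical value, this route reaches the same conclusion as the P value from Excel.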

95% confidence intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
  • Where there is overlap, there is not a significant difference.
  • Where there is no overlap, there is a significant difference.
  • If the overlap (or difference) is small, a t-test should still be carried out.

In Excel: =CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05, C15, 10)
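The Excel formula above returns the half-width of the confidence interval, based on the normal distribution. A minimal Python sketch of the same calculation follows; it is an illustration added here, the function name `confidence_norm` is hypothetical, and the data are the Table 1 values read to one decimal place.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_norm(alpha, sd, n):
    """Half-width of a normal-based confidence interval: z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * sd / sqrt(n)

# Bill lengths in mm (±0.1 mm), read from Table 1 (assumed one decimal place)
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

margin_a = confidence_norm(0.05, stdev(a_colubris), len(a_colubris))
margin_c = confidence_norm(0.05, stdev(c_latirostris), len(c_latirostris))

upper_a = mean(a_colubris) + margin_a
lower_c = mean(c_latirostris) - margin_c

print(f"A. colubris:    {mean(a_colubris):.1f} ± {margin_a:.2f} mm")
print(f"C. latirostris: {mean(c_latirostris):.1f} ± {margin_c:.2f} mm")
print("95% CI bars overlap" if upper_a >= lower_c else "95% CI bars do not overlap")
```

The 95% confidence interval bars are narrower than the standard deviation bars because the standard deviation is divided by the square root of n.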

Error bars can have very different purposes.

Standard deviation
  • You really need to know this.
  • Look for relative size of bars.
  • Used to indicate spread of most of the data around the mean.
  • Can imply reliability of data.

95% confidence intervals
  • Adds value to labs where we are looking for differences.
  • Look for overlap, not size.
  • Overlap: no significant difference.
  • No overlap: significant difference.

Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21, n=21)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

Two-tailed t-table of critical values:

  P value        0.1     0.05    0.02    0.01    0.005
  Confidence     90%     95%     98%     99%     99.5%

  Degrees of freedom
   1             6.31   12.71   31.82   63.66   127.34
   2             2.92    4.30    6.96    9.92    14.09
   3             2.35    3.18    4.54    5.84     7.45
   4             2.13    2.78    3.75    4.60     5.60
   5             2.02    2.57    3.37    4.03     4.77
   6             1.94    2.45    3.14    3.71     4.32
   7             1.89    2.36    3.00    3.50     4.03
   8             1.86    2.31    2.90    3.36     3.83
   9             1.83    2.26    2.82    3.25     3.69
  10             1.81    2.23    2.76    3.17     3.58
  11             1.80    2.20    2.72    3.11     3.50
  12             1.78    2.18    2.68    3.05     3.43
  13             1.77    2.16    2.65    3.01     3.37
  14             1.76    2.14    2.62    2.98     3.33
  15             1.75    2.13    2.60    2.95     3.29
  16             1.75    2.12    2.58    2.92     3.25
  17             1.74    2.11    2.57    2.90     3.22
  18             1.73    2.10    2.55    2.88     3.20
  19             1.73    2.09    2.54    2.86     3.17
  20             1.72    2.09    2.53    2.85     3.15
  21             1.72    2.08    2.52    2.83     3.14
  22             1.72    2.07    2.51    2.82     3.12
  23             1.71    2.07    2.50    2.81     3.10
  24             1.71    2.06    2.49    2.80     3.09
  25             1.71    2.06    2.49    2.79     3.08
  26             1.71    2.06    2.48    2.78     3.07
  27             1.70    2.05    2.47    2.77     3.06
  28             1.70    2.05    2.47    2.76     3.05
  29             1.70    2.05    2.46    2.76     3.04
  30             1.70    2.04    2.46    2.75     3.03
  31             1.70    2.04    2.45    2.74     3.02
  32             1.69    2.04    2.45    2.74     3.02
  33             1.69    2.03    2.44    2.73     3.01
  34             1.69    2.03    2.44    2.73     3.00
  35             1.69    2.03    2.44    2.72     3.00
  36             1.69    2.03    2.43    2.72     2.99
  37             1.69    2.03    2.43    2.72     2.99
  38             1.69    2.02    2.43    2.71     2.98
  39             1.68    2.02    2.43    2.71     2.98
  40             1.68    2.02    2.42    2.70     2.97

Cartoon from http://www.xkcd.com/552

"Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From MrT's Excel Statbook

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See - Think - Wonder

See: What is factual about the graph?
  • What are the axes?
  • What is being plotted?
  • What values are present?

Think: How is the graph interpreted?
  • What relationship is present?
  • Is cause implied?
  • What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph.
  • What do you need to know more about?

See - Think - Wonder: Visible Thinking Routine

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common but that do not directly affect each other.
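A confounding variable can be illustrated with a small, entirely hypothetical data set. In the Python sketch below (added here for illustration, with invented numbers and invented variable names), two measurements are each driven by a shared third variable, so they correlate strongly even though neither affects the other.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / spread

# Hypothetical confounder, e.g. a shared environmental condition
confounder = [18, 20, 22, 24, 26, 28, 30]

# Invented offsets stand in for unrelated measurement scatter
variable_x = [3 * z + e for z, e in zip(confounder, [1, -1, 0, 2, -2, 1, -1])]
variable_y = [2 * z + e for z, e in zip(confounder, [-1, 1, 2, 0, -1, -2, 1])]

# x and y never influence each other, yet they correlate strongly
r = pearson_r(variable_x, variable_y)
print(f"r = {r:.2f}")
```

A high r here says nothing about cause; only a controlled experiment that changes one variable while holding the confounder fixed could separate the two.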

To be able to determine causality through experimentation, we need:
  • One clearly identified independent variable
  • Carefully measured dependent variable(s) that can be attributed to change in the independent variable
  • Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
  • Atmospheric CO2 concentrations and global warming
  • Atmospheric CO2 concentrations and the rate of photosynthesis
  • Temperature and enzyme activity

Flamenco Dancer, by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen

Page 2: Statistical Analysis

Assessment Statements Obj

111State that error bars are a graphical representation of the variability of data Range and standard deviation show the variability spread in the data 95 Confidence Interval error bars suggest significance of difference where there is no

overlap 1

112Calculate the mean and standard deviation of a set of values Using Excel (Formula =STDEV(rawdata)) Using your calculator

2

113State that the term standard deviation (s) is used to summarize the spread of values around the mean and that 68 of all data fall within (plusmn) 1 standard deviation of the mean

1

114Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples A greater standard deviation shows a greater variability of data around the mean This can be used to infer reliability in methods or results

3

115Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables Using t-values t-tables and critical values Directly calculating P values using Excel in lab reports

3

116 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables 3

Assessment statements from Online IB Biology Subject GuideCommand terms httpi-biologynetibdpbiocommand-terms

MrTrsquos Excel Statbookhas guidance and lsquoliversquo examples of tables graphs and statistical tests

httpi-biologynetict-in-ib-biologyspreadsheets-graphingstatexcel

ldquoWhy is this BiologyrdquoVariation in populations

Variability in results

affects

Confidence in conclusions

The key methodology in Biology is hypothesis testing through experimentation

Carefully-designed and controlled experiments and surveys give us quantitative

(numeric) data that can be compared

We can use the data collected to test our hypothesis and form explanations of the

processes involvedhellip but only if we can be confident in our results

We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data

Image Transverse section of part of a stem of a Dead-nettle (Lamium sp) showing+a+vascular+bundle+and+part+of+the+cortex httpwwwflickrcomphotos71183136N086959590092 Found on flickrccnet

ldquoWhich medicine should I prescriberdquo

Image from httpwwwmsforginternational-activity-report-2010-sierra-leoneDonate to Medecins Sans Friontiers through Biology4Good httpi-biologynetaboutbiology4good

ldquoWhich medicine should I prescriberdquo

Image from httpwwwmsforginternational-activity-report-2010-sierra-leoneDonate to Medecins Sans Friontiers through Biology4Good httpi-biologynetaboutbiology4good

Generic drugs are out-of-patent and are much cheaper than the proprietary (brand-name) equivalents Doctors need to balance needs with available resources Which would you choose

ldquoWhich medicine should I prescriberdquo

Image from httpwwwmsforginternational-activity-report-2010-sierra-leoneDonate to Medecins Sans Friontiers through Biology4Good httpi-biologynetaboutbiology4good

Means (averages) in Biology are almost never good enough Biological systems (and our results) show variability

Which would you choose now

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower)

In return for food they pollinate the flower This is an example of mutualism ndash benefit for all

As a result of natural selection hummingbird bills have evolved

Birds with a bill best suited to their preferred food source have

the greater chance of survival

Photo Archilochus colubris from wikimedia commons by Dick Daniels

Researchers studying comparative anatomy collect data on bill-length in two species of hummingbirds Archilochus colubris (red-throated hummingbird) and Cynanthus latirostris (broadbilled hummingbird)

To do this they need to collect sufficientrelevant reliable data so they can testthe Null hypothesis (H0) that

ldquothere is no significant difference in bill length between the two speciesrdquo

Photo Archilochus colubris (male) wikimedia commons by Joe Schneid

The sample size must be large enough to provide

sufficient reliable data and for us to carry out relevant statistical

tests for significance

We must also be mindful of uncertainty in our measuring tools

and error in our results

Photo Broadbilled hummingbird (wikimedia commons)

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean      s           

Calculate the mean using bull Your calculator (sum of values n)

bull Excel

=AVERAGE(highlight raw data)

n = sample size The bigger the better In this case n=10 for each group

All values should be centred in the cell with decimal places consistent with the measuring tool uncertainty

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s

       

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool)

Uncertainties must be included

Descriptive table title and number

DELETE

X

DELETE

X

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Descriptive title with graph number

Labeled point

Y-axis clearly labeled with uncertainty

Make sure that the y-axis begins at zero

x-axis labeled

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

From the means alone you might conclude that C latirostris has a longer bill than A colubris

But the mean only tells part of the story

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

C latirostris

A colubris

Standard deviation is a measure of the spread of most of the data Error bars are a graphical representation of the variability of data

Which of the two sets of data has

a The highest mean

b The greatest variability in the data

A

B

Error bars could represent standard deviation range or confidence intervals

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars This is very important

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap No overlap

Lots of shared data points within each data set

Results are not likely to be significantly different from each other

Any difference is most likely due to chance

No (or very few) shared data points within each data set

Results are more likely to be significantly different from each other

The difference is more likely to be lsquorealrsquo

-30

20

70

120

170

220

A colubris 159mm(n=10)

C latirostris 188mm(n=10)

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris(error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Our results show a very small overlap between the two sets of data

So how do we know if the difference is significant or not

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0)

ldquoThere is no significant differencerdquo

This is the lsquodefaultrsquo hypothesis that we always testIn our conclusion we either accept the null hypothesis or reject it

A t-test can be used to test whether the difference between two means is significant bull If we accept H0 then the means are not significantly different bull If we reject H0 then the means are significantly different

Rememberbull We are never lsquotryingrsquo to get a difference We design carefully-controlled experiments and

then analyse the results using statistical analysis

Example two-tailed t-table (the numbers in the body of the table are the "critical values")

P value              0.1     0.05    0.02    0.01
Confidence           90%     95%     98%     99%

Degrees of freedom
 1                   6.31    12.71   31.82   63.66
 2                   2.92    4.30    6.96    9.92
 3                   2.35    3.18    4.54    5.84
 4                   2.13    2.78    3.75    4.60
 5                   2.02    2.57    3.37    4.03
 6                   1.94    2.45    3.14    3.71
 7                   1.89    2.36    3.00    3.50
 8                   1.86    2.31    2.90    3.36
 9                   1.83    2.26    2.82    3.25
10                   1.81    2.23    2.76    3.17

We can calculate the value of 't' for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

"Degrees of Freedom (df)" is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?
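The presentation does not give a formula for t, so the following is only a sketch under an assumption: it uses the pooled (equal-variance) two-sample formula, with the bill-length measurements from Table 1 of this presentation. The function name `pooled_t` is hypothetical. Degrees of freedom are the total sample size minus two, as stated above.

```python
from math import sqrt
from statistics import mean, variance

# Bill lengths (±0.1 mm) from Table 1 of the presentation.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an unpaired t-test, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # total sample size minus two
    # Pool the two sample variances, weighting each by its own n - 1.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = abs(mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


t_value, df = pooled_t(a_colubris, c_latirostris)
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```

The resulting t can then be compared with the critical value for the matching degrees of freedom and chosen P value.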

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation: we are working in two directions – within each population and across populations.

Worked example (2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php)

t was calculated as 2.15 (this is done for you).

At P = 0.05, the critical value (cv) read from the table is 2.069.

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference). If t > cv, reject H0 (there is a significant difference).

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
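The decision rule above can be written as a tiny function. This is a minimal sketch: the function name `t_test_conclusion` is invented for illustration, and the numbers are the t and critical value quoted in the worked example.

```python
def t_test_conclusion(t, critical_value):
    """Apply the rule: t > cv rejects H0, otherwise H0 is accepted."""
    if abs(t) > critical_value:
        return "reject H0: there is a significant difference"
    return "accept H0: there is no significant difference"


# Values from the worked example: t = 2.15, critical value = 2.069 at P = 0.05.
conclusion = t_test_conclusion(2.15, 2.069)
print(conclusion)
```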

Practice examples (2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php)

Critical value 2.045: "There is no significant difference in the size of shells between north-side and south-side snail populations."

Critical value 2.086: "There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability of seeing a difference at least this large by chance alone if H0 were true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two 'tails' moving away from the mean; a two-tailed test looks for a difference in either direction. Type 2 (unpaired): we are comparing one whole population with the other whole population.

(Type 1 pairs the results of each individual in set A with the same individual in set B.)
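For readers without Excel, here is a hedged sketch of how a two-tailed, unpaired P value could be approximated with only the Python standard library. It assumes the pooled (equal-variance) t formula and integrates the t-distribution curve numerically with Simpson's rule. It is not how Excel computes the value internally, and the function names are made up for this example.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

# Bill lengths (±0.1 mm) from Table 1 of the presentation.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an unpaired t-test, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    t = abs(mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t-distribution at x."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P = 1 - area under the t curve between -|t| and +|t| (Simpson's rule)."""
    t = abs(t)
    h = 2 * t / steps  # steps must be even for Simpson's rule
    total = t_pdf(-t, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-t + i * h, df)
    return 1 - total * h / 3


t_value, df = pooled_t(a_colubris, c_latirostris)
p_value = two_tailed_p(t_value, df)
print(f"t = {t_value:.2f}, df = {df}, P = {p_value:.5f}")
```

If P is below the chosen threshold, H0 is rejected, which is the same rule as comparing t with the critical value.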

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference.
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.

In our graph: no overlap.

=CONFIDENCE.NORM(0.05, stdev, samplesize)
e.g. =CONFIDENCE.NORM(0.05, C15, 10)
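The sketch below mirrors what Excel's CONFIDENCE.NORM returns, namely the half-width of a confidence interval based on the normal distribution, using only the Python standard library. The function name `confidence_norm` is an illustrative stand-in, and the inputs are the means and standard deviations quoted in this presentation.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha, sd, n):
    """Half-width of a (1 - alpha) confidence interval: z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * sd / sqrt(n)


# Summary values quoted in the presentation (bill length in mm, n = 10 each).
margin_a = confidence_norm(0.05, 1.91, 10)
margin_c = confidence_norm(0.05, 1.03, 10)

ci_a = (15.9 - margin_a, 15.9 + margin_a)
ci_c = (18.8 - margin_c, 18.8 + margin_c)

print(f"A. colubris:    15.9 ± {margin_a:.2f} mm")
print(f"C. latirostris: 18.8 ± {margin_c:.2f} mm")
print("Overlap" if ci_a[1] >= ci_c[0] else "No overlap")
```

With these inputs the two intervals do not overlap, which agrees with the note on the slide.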

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no significant difference
• No overlap: significant difference

Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21 for each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value              0.1     0.05    0.02    0.01    0.005
Confidence           90%     95%     98%     99%     99.5%

Degrees of freedom
 1                   6.31    12.71   31.82   63.66   127.34
 2                   2.92    4.30    6.96    9.92    14.09
 3                   2.35    3.18    4.54    5.84    7.45
 4                   2.13    2.78    3.75    4.60    5.60
 5                   2.02    2.57    3.37    4.03    4.77
 6                   1.94    2.45    3.14    3.71    4.32
 7                   1.89    2.36    3.00    3.50    4.03
 8                   1.86    2.31    2.90    3.36    3.83
 9                   1.83    2.26    2.82    3.25    3.69
10                   1.81    2.23    2.76    3.17    3.58
11                   1.80    2.20    2.72    3.11    3.50
12                   1.78    2.18    2.68    3.05    3.43
13                   1.77    2.16    2.65    3.01    3.37
14                   1.76    2.14    2.62    2.98    3.33
15                   1.75    2.13    2.60    2.95    3.29
16                   1.75    2.12    2.58    2.92    3.25
17                   1.74    2.11    2.57    2.90    3.22
18                   1.73    2.10    2.55    2.88    3.20
19                   1.73    2.09    2.54    2.86    3.17
20                   1.72    2.09    2.53    2.85    3.15
21                   1.72    2.08    2.52    2.83    3.14
22                   1.72    2.07    2.51    2.82    3.12
23                   1.71    2.07    2.50    2.81    3.10
24                   1.71    2.06    2.49    2.80    3.09
25                   1.71    2.06    2.49    2.79    3.08
26                   1.71    2.06    2.48    2.78    3.07
27                   1.70    2.05    2.47    2.77    3.06
28                   1.70    2.05    2.47    2.76    3.05
29                   1.70    2.05    2.46    2.76    3.04
30                   1.70    2.04    2.46    2.75    3.03
31                   1.70    2.04    2.45    2.74    3.02
32                   1.69    2.04    2.45    2.74    3.02
33                   1.69    2.03    2.44    2.73    3.01
34                   1.69    2.03    2.44    2.73    3.00
35                   1.69    2.03    2.44    2.72    3.00
36                   1.69    2.03    2.43    2.72    2.99
37                   1.69    2.03    2.43    2.72    2.99
38                   1.69    2.02    2.43    2.71    2.98
39                   1.68    2.02    2.43    2.71    2.98
40                   1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552/

"Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From MrT's Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine)

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity
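To make the confounding-variable idea concrete, here is a small sketch with made-up numbers (they are not data from the presentation). Two variables, `x` and `y`, are each built from a shared hidden `confounder` plus their own unrelated wobble. Neither is calculated from the other, yet they correlate strongly. All names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustration: a hidden factor that drives both measured variables.
confounder = [1, 2, 3, 4, 5, 6, 7, 8]
wobble_x = [1, -1, 1, -1, 1, -1, 1, -1]
wobble_y = [1, 1, -1, -1, 1, 1, -1, -1]

# x and y each depend on the confounder, never on each other.
x = [2 * z + w for z, w in zip(confounder, wobble_x)]
y = [3 * z + w for z, w in zip(confounder, wobble_y)]

r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

A high r here says nothing about x causing y. Only a controlled experiment that changes one variable while holding the others constant could test that.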

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

@IBiologyStephen


MrT's Excel Statbook has guidance and 'live' examples of tables, graphs and statistical tests: http://i-biology.net/ict-in-ib-biology/spreadsheets-graphing/statexcel

"Why is this Biology?"

Variation in populations → Variability in results → affects → Confidence in conclusions

The key methodology in Biology is hypothesis testing through experimentation.

Carefully-designed and controlled experiments and surveys give us quantitative (numeric) data that can be compared.

We can use the data collected to test our hypothesis and form explanations of the processes involved… but only if we can be confident in our results.

We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data.

Image: Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex, http://www.flickr.com/photos/71183136@N08/6959590092. Found on flickrcc.net

"Which medicine should I prescribe?"

Image from http://www.msf.org/international-activity-report-2010-sierra-leone. Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good

Generic drugs are out-of-patent and are much cheaper than the proprietary (brand-name) equivalents. Doctors need to balance needs with available resources. Which would you choose?

Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability.

Which would you choose now?

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower).

In return for food, they pollinate the flower. This is an example of mutualism – benefit for all.

As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival.

Photo: Archilochus colubris, from Wikimedia Commons, by Dick Daniels

Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird).

To do this, they need to collect sufficient, relevant, reliable data so they can test the null hypothesis (H0) that:

"there is no significant difference in bill length between the two species."

Photo: Archilochus colubris (male), Wikimedia Commons, by Joe Schneid

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broad-billed hummingbird (Wikimedia Commons)

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

        Bill length (±0.1 mm)
n       A. colubris     C. latirostris
1       13.0            17.0
2       14.0            18.0
3       15.0            18.0
4       15.0            18.0
5       15.0            19.0
6       16.0            19.0
7       16.0            19.0
8       18.0            20.0
9       18.0            20.0
10      19.0            20.0

Calculate the mean using:
• Your calculator (sum of values / n)
• Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case, n=10 for each group.

All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.
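The calculator method (sum of values divided by n) can also be sketched in a few lines of Python. This is only an illustration using the Table 1 measurements, and the helper name `mean_1dp` is made up. It rounds to one decimal place to stay consistent with the ±0.1 mm uncertainty.

```python
# Bill lengths (±0.1 mm) from Table 1 of the presentation.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def mean_1dp(values):
    """Sum of values divided by n, rounded to match the ±0.1 mm uncertainty."""
    return round(sum(values) / len(values), 1)


mean_a = mean_1dp(a_colubris)
mean_c = mean_1dp(c_latirostris)

print(f"A. colubris mean    = {mean_a} mm (n={len(a_colubris)})")
print(f"C. latirostris mean = {mean_c} mm (n={len(c_latirostris)})")
```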

Table 1 with the means filled in: A. colubris 15.9 mm; C. latirostris 18.8 mm.

• Raw data and the mean need to have consistent decimal places (in line with the uncertainty of the measuring tool).
• Uncertainties must be included.
• The table needs a descriptive title and number.

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. Bars: A. colubris 15.9 mm; C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm).

• Descriptive title, with graph number
• Labeled point
• Y-axis clearly labeled, with uncertainty
• Make sure that the y-axis begins at zero
• x-axis labeled

From the means alone, you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

http://click4biology.info/c4b/1/gcStat.htm

http://mathbits.com/MathBits/TINSection/Statistics1/Spreadsheet.html

Standard deviation is a measure of the spread of most of the data.

Table 1, now with the mean and standard deviation (s) rows filled in:

        A. colubris     C. latirostris
Mean    15.9            18.8
s       1.91            1.03

Standard deviation can have one more decimal place. In Excel: =STDEV(highlight RAW data)

Which of the two sets of data has:
a. The longest mean bill length?
b. The greatest variability in the data?

Answers:
a. C. latirostris
b. A. colubris
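As a cross-check on the values of s in Table 1, the sketch below uses Python's `statistics.stdev`, which is the sample standard deviation (it divides by n − 1), the same quantity Excel's =STDEV returns. The data are the Table 1 measurements. Two decimal places are used because the standard deviation can have one more decimal place than the raw data.

```python
from statistics import stdev

# Bill lengths (±0.1 mm) from Table 1 of the presentation.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation, rounded to one more decimal place than the raw data.
sd_a = round(stdev(a_colubris), 2)
sd_c = round(stdev(c_latirostris), 2)

print(f"A. colubris    s = {sd_a}")
print(f"C. latirostris s = {sd_c}")
print("Greater variability:", "A. colubris" if sd_a > sd_c else "C. latirostris")
```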

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data (A or B) has:
a. The highest mean?
b. The greatest variability in the data?

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris 15.9 mm; C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm).

Title is adjusted to show the source of the error bars. This is very important.


ldquoWhy is this BiologyrdquoVariation in populations

Variability in results

affects

Confidence in conclusions

The key methodology in Biology is hypothesis testing through experimentation

Carefully-designed and controlled experiments and surveys give us quantitative

(numeric) data that can be compared

We can use the data collected to test our hypothesis and form explanations of the

processes involvedhellip but only if we can be confident in our results

We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data

Image Transverse section of part of a stem of a Dead-nettle (Lamium sp) showing+a+vascular+bundle+and+part+of+the+cortex httpwwwflickrcomphotos71183136N086959590092 Found on flickrccnet

“Which medicine should I prescribe?”

Image from http://www.msf.org/international-activity-report-2010-sierra-leone. Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good

Generic drugs are out-of-patent and are much cheaper than the proprietary (brand-name) equivalents. Doctors need to balance needs with available resources. Which would you choose?

Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability.

Which would you choose now?

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower).

In return for food, they pollinate the flower. This is an example of mutualism: benefit for all.

As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival.

Photo: Archilochus colubris, from wikimedia commons, by Dick Daniels

Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird).

To do this, they need to collect sufficient, relevant, reliable data so they can test the Null hypothesis (H0) that:

“there is no significant difference in bill length between the two species.”

Photo: Archilochus colubris (male), wikimedia commons, by Joe Schneid

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broadbilled hummingbird (wikimedia commons)

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

          Bill length (±0.1 mm)
n         A. colubris    C. latirostris
1         13.0           17.0
2         14.0           18.0
3         15.0           18.0
4         15.0           18.0
5         15.0           19.0
6         16.0           19.0
7         16.0           19.0
8         18.0           20.0
9         18.0           20.0
10        19.0           20.0
Mean      15.9           18.8

Calculate the mean using:
• Your calculator (sum of values ÷ n)
• Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case n = 10 for each group.

All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.

• Raw data and the mean need to have consistent decimal places (in line with the uncertainty of the measuring tool)
• Uncertainties must be included
• Descriptive table title and number

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. [Bar chart. x-axis: species of hummingbird. y-axis: mean bill length (±0.1 mm), 0.0 to 20.0. Bars: A. colubris, 15.9 mm; C. latirostris, 18.8 mm.]

• Descriptive title with graph number
• Labeled point
• Y-axis clearly labeled with uncertainty
• Make sure that the y-axis begins at zero
• x-axis labeled

From the means alone, you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data.

Adding the standard deviation (s) to Table 1 (bill length, ±0.1 mm; n = 10 for each species):

          A. colubris    C. latirostris
Mean      15.9           18.8
s         1.91           1.03

Standard deviation can have one more decimal place. In Excel: =STDEV(highlight RAW data)

Which of the two sets of data has:

a. The longest mean bill length? C. latirostris

b. The greatest variability in the data? A. colubris
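The Mean and s rows of Table 1 can be checked outside Excel. The following is a minimal sketch in Python (not part of the original slides), using only the standard library; the list names and the `summarise` helper are illustrative assumptions. `statistics.stdev` uses the sample (n - 1) formula, which is also what Excel's =STDEV uses.

```python
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def summarise(values):
    """Return (n, mean, sample standard deviation) for a list of measurements."""
    n = len(values)
    mean = statistics.mean(values)   # equivalent to Excel =AVERAGE(...)
    s = statistics.stdev(values)     # sample SD (n - 1), like Excel =STDEV(...)
    return n, mean, s


for name, data in [("A. colubris", a_colubris), ("C. latirostris", c_latirostris)]:
    n, mean, s = summarise(data)
    # Mean to the same decimal places as the raw data, s with one more
    print(f"{name}: n={n}, mean={mean:.1f} mm, s={s:.2f} mm")
```

The larger s for A. colubris is what tells us its data are more variable, even though its mean is smaller.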

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data has:

a. The highest mean?

b. The greatest variability in the data?

A

B

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). [Bar chart. x-axis: species of hummingbird. y-axis: mean bill length (±0.1 mm), 0.0 to 20.0. Bars: A. colubris, 15.9 mm; C. latirostris, 18.8 mm.]

The title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars. Variability has been visualised.

The error bars overlap somewhat. What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). [Bar chart. x-axis: species of hummingbird. y-axis: mean bill length (±0.1 mm), -3.0 to 22.0. Bars: A. colubris, 15.9 mm (n=10); C. latirostris, 18.8 mm (n=10).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.
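To see how small the overlap is, the ends of the two standard deviation error bars can be calculated directly. This is a rough sketch in Python (not from the original slides); the helper names `sd_interval` and `overlap` are made up for illustration, and the data are the Table 1 bill lengths.

```python
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def sd_interval(values):
    """Return (lower, upper) for mean ± 1 sample standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return mean - s, mean + s


def overlap(interval_1, interval_2):
    """Length of the region shared by two intervals (0.0 if they do not overlap)."""
    low = max(interval_1[0], interval_2[0])
    high = min(interval_1[1], interval_2[1])
    return max(0.0, high - low)


a_bar = sd_interval(a_colubris)
c_bar = sd_interval(c_latirostris)
print(f"A. colubris error bar:    {a_bar[0]:.2f} to {a_bar[1]:.2f} mm")
print(f"C. latirostris error bar: {c_bar[0]:.2f} to {c_bar[1]:.2f} mm")
print(f"Overlap: {overlap(a_bar, c_bar):.2f} mm")
```

The top of the A. colubris bar only just reaches the bottom of the C. latirostris bar, so the bars alone cannot settle the question.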

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0):

“There is no significant difference.”

This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant.
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.

Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table (“critical values”)

P value               0.1      0.05     0.02     0.01
Confidence            90%      95%      98%      99%
Degrees of freedom
1                     6.31     12.71    31.82    63.66
2                     2.92     4.30     6.96     9.92
3                     2.35     3.18     4.54     5.84
4                     2.13     2.78     3.75     4.60
5                     2.02     2.57     3.37     4.03
6                     1.94     2.45     3.14     3.71
7                     1.89     2.36     3.00     3.50
8                     1.86     2.31     2.90     3.36
9                     1.83     2.26     2.82     3.25
10                    1.81     2.23     2.76     3.17

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

“Degrees of Freedom (df)” is the total sample size (both groups combined) minus two.

• What happens to the value of P as the confidence in the results increases?
• What happens to the critical value as the confidence level increases?

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation: we are working in two directions, within each population and across populations.

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

From the table, the critical value (cv) at P = 0.05 is 2.069.

t      cv
2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: “There is a significant difference in the wing spans of the two populations of birds.”
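The decision rule above can be written out as a short sketch in Python (not from the original slides). The function names `degrees_of_freedom` and `decide` are hypothetical, chosen for illustration; the numbers are the t and cv values given in the worked example.

```python
def degrees_of_freedom(n1, n2):
    """Degrees of freedom for an unpaired t-test: total sample size minus two."""
    return n1 + n2 - 2


def decide(t, critical_value):
    """Apply the slide's rule: compare the calculated t with the critical value."""
    if abs(t) > critical_value:
        return "reject H0 (there is a significant difference)"
    return "accept H0 (there is no significant difference)"


# Worked example from the slide: t = 2.15, cv = 2.069 at P = 0.05
print(decide(2.15, 2.069))

# For the hummingbird data (n = 10 per species) we would look up df = 18
print(degrees_of_freedom(10, 10))
```

Taking the absolute value of t means the rule works whichever group is entered first.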

cv = 2.045

“There is no significant difference in the size of shells between north-side and south-side snail populations.”

cv = 2.086

“There is a significant difference in the resting heart rates between the two groups of swimmers.”

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability of seeing a difference this large by chance alone if H0 were true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean.

Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B.)
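For anyone curious what sits behind the Excel function, here is a minimal sketch of the unpaired, equal-variance t calculation in Python (not from the original slides), using only the standard library. The function name `t_statistic` is an assumption for illustration. It gives t and df but not P; the critical values are read from the slide's t-table for df = 18.

```python
import math
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def t_statistic(sample_1, sample_2):
    """Unpaired, equal-variance two-sample t statistic and degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighted by their degrees of freedom
    pooled_variance = ((n1 - 1) * statistics.variance(sample_1)
                       + (n2 - 1) * statistics.variance(sample_2)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    difference = statistics.mean(sample_1) - statistics.mean(sample_2)
    return abs(difference) / standard_error, df


t, df = t_statistic(a_colubris, c_latirostris)
cv_p05 = 2.10    # critical value for df = 18 at P = 0.05 (from the t-table)
cv_p0005 = 3.20  # critical value for df = 18 at P = 0.005 (from the t-table)

print(f"t = {t:.2f}, df = {df}")
print("Significant at P = 0.05: ", t > cv_p05)
print("Significant at P = 0.005:", t > cv_p0005)
```

The calculated t is well above both critical values, which agrees with the very small P value reported by Excel. If SciPy is installed, `scipy.stats.ttest_ind(a_colubris, c_latirostris)` should return the P value directly.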

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference.
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.

[Graph annotation: no overlap]

In Excel: =CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05, C15, 10)
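The Excel formula returns the half-width of the error bar: z × s ÷ √n, where z is about 1.96 for 95% confidence. A small sketch in Python (not from the original slides) reproduces this for the hummingbird data; the function name `confidence_norm` is chosen here to mirror the Excel function and is an assumption.

```python
import math
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def confidence_norm(alpha, stdev, size):
    """Half-width of a normal-based confidence interval (like CONFIDENCE.NORM)."""
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * stdev / math.sqrt(size)


def ci95(values):
    """Return (lower, upper) limits of the 95% confidence interval of the mean."""
    mean = statistics.mean(values)
    half_width = confidence_norm(0.05, statistics.stdev(values), len(values))
    return mean - half_width, mean + half_width


a_low, a_high = ci95(a_colubris)
c_low, c_high = ci95(c_latirostris)
print(f"A. colubris 95% CI:    {a_low:.2f} to {a_high:.2f} mm")
print(f"C. latirostris 95% CI: {c_low:.2f} to {c_high:.2f} mm")
print("Overlap:", a_high >= c_low)
```

For these data the two 95% confidence intervals do not overlap, unlike the standard deviation error bars.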

Error bars can have very different purposes.

Standard deviation:
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals:
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no significant difference
• No overlap: significant difference

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score.

(Error bars = standard deviation; n=21, n=21)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the Ig Nobel Prize for Biology 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value               0.1      0.05     0.02     0.01     0.005
Confidence            90%      95%      98%      99%      99.50%
Degrees of freedom
1                     6.31     12.71    31.82    63.66    127.34
2                     2.92     4.30     6.96     9.92     14.09
3                     2.35     3.18     4.54     5.84     7.45
4                     2.13     2.78     3.75     4.60     5.60
5                     2.02     2.57     3.37     4.03     4.77
6                     1.94     2.45     3.14     3.71     4.32
7                     1.89     2.36     3.00     3.50     4.03
8                     1.86     2.31     2.90     3.36     3.83
9                     1.83     2.26     2.82     3.25     3.69
10                    1.81     2.23     2.76     3.17     3.58
11                    1.80     2.20     2.72     3.11     3.50
12                    1.78     2.18     2.68     3.05     3.43
13                    1.77     2.16     2.65     3.01     3.37
14                    1.76     2.14     2.62     2.98     3.33
15                    1.75     2.13     2.60     2.95     3.29
16                    1.75     2.12     2.58     2.92     3.25
17                    1.74     2.11     2.57     2.90     3.22
18                    1.73     2.10     2.55     2.88     3.20
19                    1.73     2.09     2.54     2.86     3.17
20                    1.72     2.09     2.53     2.85     3.15
21                    1.72     2.08     2.52     2.83     3.14
22                    1.72     2.07     2.51     2.82     3.12
23                    1.71     2.07     2.50     2.81     3.10
24                    1.71     2.06     2.49     2.80     3.09
25                    1.71     2.06     2.49     2.79     3.08
26                    1.71     2.06     2.48     2.78     3.07
27                    1.70     2.05     2.47     2.77     3.06
28                    1.70     2.05     2.47     2.76     3.05
29                    1.70     2.05     2.46     2.76     3.04
30                    1.70     2.04     2.46     2.75     3.03
31                    1.70     2.04     2.45     2.74     3.02
32                    1.69     2.04     2.45     2.74     3.02
33                    1.69     2.03     2.44     2.73     3.01
34                    1.69     2.03     2.44     2.73     3.00
35                    1.69     2.03     2.44     2.72     3.00
36                    1.69     2.03     2.43     2.72     2.99
37                    1.69     2.03     2.43     2.72     2.99
38                    1.69     2.02     2.43     2.71     2.98
39                    1.68     2.02     2.43     2.71     2.98
40                    1.68     2.02     2.42     2.70     2.97

Cartoon from http://www.xkcd.com/552

“Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.”

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See - Think - Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See - Think - Wonder: Visible Thinking Routine

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: other factors that influence both of the correlated variables, so that the two change together without directly affecting each other.
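A confounding variable can be demonstrated with a few lines of Python (not from the original slides). In this sketch all of the numbers are made up purely for illustration: x and y are each built from a third variable z plus a little unrelated noise, and never influence each other. The helper `pearson_r` is written out by hand so nothing beyond the standard library is needed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Made-up data: z is the confounding variable that drives both x and y
z = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
noise_y = [-0.1, 0.3, -0.3, 0.1, 0.2, -0.2, 0.1, -0.1]
x = [2 * zi + e for zi, e in zip(z, noise_x)]
y = [3 * zi + e for zi, e in zip(z, noise_y)]

# x and y are strongly correlated, although neither affects the other
r_xy = pearson_r(x, y)

# Take out the part of each variable that is explained by z
x_without_z = [xi - 2 * zi for xi, zi in zip(x, z)]
y_without_z = [yi - 3 * zi for yi, zi in zip(y, z)]
r_left_over = pearson_r(x_without_z, y_without_z)

print(f"Correlation between x and y:       {r_xy:.2f}")
print(f"Correlation once z is accounted for: {r_left_over:.2f}")
```

Once the shared influence of z is removed, almost nothing is left of the relationship between x and y. This is why a controlled experiment, not a correlation, is needed to establish cause.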

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey, http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen

Page 5: Statistical Analysis

ldquoWhich medicine should I prescriberdquo

Image from httpwwwmsforginternational-activity-report-2010-sierra-leoneDonate to Medecins Sans Friontiers through Biology4Good httpi-biologynetaboutbiology4good

ldquoWhich medicine should I prescriberdquo

Image from httpwwwmsforginternational-activity-report-2010-sierra-leoneDonate to Medecins Sans Friontiers through Biology4Good httpi-biologynetaboutbiology4good

Generic drugs are out-of-patent and are much cheaper than the proprietary (brand-name) equivalents Doctors need to balance needs with available resources Which would you choose

ldquoWhich medicine should I prescriberdquo

Image from httpwwwmsforginternational-activity-report-2010-sierra-leoneDonate to Medecins Sans Friontiers through Biology4Good httpi-biologynetaboutbiology4good

Means (averages) in Biology are almost never good enough Biological systems (and our results) show variability

Which would you choose now

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower)

In return for food they pollinate the flower This is an example of mutualism ndash benefit for all

As a result of natural selection hummingbird bills have evolved

Birds with a bill best suited to their preferred food source have

the greater chance of survival

Photo Archilochus colubris from wikimedia commons by Dick Daniels

Researchers studying comparative anatomy collect data on bill-length in two species of hummingbirds Archilochus colubris (red-throated hummingbird) and Cynanthus latirostris (broadbilled hummingbird)

To do this they need to collect sufficientrelevant reliable data so they can testthe Null hypothesis (H0) that

ldquothere is no significant difference in bill length between the two speciesrdquo

Photo Archilochus colubris (male) wikimedia commons by Joe Schneid

The sample size must be large enough to provide

sufficient reliable data and for us to carry out relevant statistical

tests for significance

We must also be mindful of uncertainty in our measuring tools

and error in our results

Photo Broadbilled hummingbird (wikimedia commons)

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean      s           

Calculate the mean using bull Your calculator (sum of values n)

bull Excel

=AVERAGE(highlight raw data)

n = sample size The bigger the better In this case n=10 for each group

All values should be centred in the cell with decimal places consistent with the measuring tool uncertainty

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s

       

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool)

Uncertainties must be included

Descriptive table title and number

DELETE

X

DELETE

X

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Descriptive title with graph number

Labeled point

Y-axis clearly labeled with uncertainty

Make sure that the y-axis begins at zero

x-axis labeled

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

From the means alone you might conclude that C latirostris has a longer bill than A colubris

But the mean only tells part of the story

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

C latirostris

A colubris

Standard deviation is a measure of the spread of most of the data Error bars are a graphical representation of the variability of data

Which of the two sets of data has

a The highest mean

b The greatest variability in the data

A

B

Error bars could represent standard deviation range or confidence intervals

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars This is very important

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap No overlap

Lots of shared data points within each data set

Results are not likely to be significantly different from each other

Any difference is most likely due to chance

No (or very few) shared data points within each data set

Results are more likely to be significantly different from each other

The difference is more likely to be lsquorealrsquo

-30

20

70

120

170

220

A colubris 159mm(n=10)

C latirostris 188mm(n=10)

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris(error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Our results show a very small overlap between the two sets of data

So how do we know if the difference is significant or not

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0)

ldquoThere is no significant differencerdquo

This is the lsquodefaultrsquo hypothesis that we always testIn our conclusion we either accept the null hypothesis or reject it

A t-test can be used to test whether the difference between two means is significant bull If we accept H0 then the means are not significantly different bull If we reject H0 then the means are significantly different

Rememberbull We are never lsquotryingrsquo to get a difference We design carefully-controlled experiments and

then analyse the results using statistical analysis

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

What happens to the value of P as the confidence in the results increases

What happens to the critical value as the confidence level increases

ldquocritical valuesrdquo

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

We usually use Plt005 (95 confidence) in Biology as our data can be highly variable

Simple explanation we are working in two directions ndash within each population and across populations

ldquocritical valuesrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

005

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

Conclusion ldquoThere is a significant difference in the wing spans of the two populations of birdsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20452045

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is no significant difference in the size of shells between north-side and south-side snail populationsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20862086

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is a significant difference in the resting heart rates between the two groups of swimmersrdquo

Excel can jump straight to a value of P for our resultsOne function (=ttest) compares both sets of data

As it calculates P directly (the probability that the difference is due to chance) we can determine significance directly

In this case P=000051

This is much smaller than 0005 so we are confident that we can

reject H0

The difference is unlikely to be due to chance

Conclusion There is a significant difference in bill length between A colubris and C latirostris

Two tails we assume data are normally distributed with two lsquotailsrsquo moving away from mean Type 2 (unpaired) we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)

95 Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a resultbull Where there is overlap there is not a significant differencebull Where there is no overlap there is a significant difference bull If the overlap (or difference) is small a t-test should still be carried out

no overlap

=CONFIDENCENORM(005stdevsamplesize)eg =CONFIDENCENORM(005C1510)

Error bars can have very different purposes

Standard deviation bull You really need to know thisbull Look for relative size of barsbull Used to indicate spread of most

of the data around the meanbull Can imply reliability of data

95 Confidence Intervalsbull Adds value to labs where we are

looking for differences bull Look for overlap not size

bull Overlap no sig diff bull No overlap sig dif

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture In one video the lecturer was fluent and engaging In the other video the lecturer was less fluent

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score

(Error bars = standard deviation)

n=21 n=21

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture In one video the lecturer was fluent and engaging In the other video the lecturer was less fluent

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score

(Error bars = standard deviation)

Is there a significant difference in the actual learning

n=21 n=21

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Evaluate the study 1 What do the error bars (standard deviation) tell us about reliability 2 How valid is the study in terms of sufficiency of data (population sizes (n))

n=21 n=21

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
Confidence    90%     95%     98%     99%     99.5%

df
1             6.31    12.71   31.82   63.66   127.34
2             2.92    4.30    6.96    9.92    14.09
3             2.35    3.18    4.54    5.84    7.45
4             2.13    2.78    3.75    4.60    5.60
5             2.02    2.57    3.37    4.03    4.77
6             1.94    2.45    3.14    3.71    4.32
7             1.89    2.36    3.00    3.50    4.03
8             1.86    2.31    2.90    3.36    3.83
9             1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

(df = degrees of freedom)

Cartoon from http://www.xkcd.com/552

"Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From Mr T's Excel Statbook

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See, Think, Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See, Think, Wonder: Visible Thinking Routine

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common, but that do not directly affect each other.

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded, but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen

Page 6: Statistical Analysis

"Which medicine should I prescribe?"

Image from http://www.msf.org/international-activity-report-2010-sierra-leone. Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good

Generic drugs are out-of-patent and are much cheaper than the proprietary (brand-name) equivalents. Doctors need to balance needs with available resources. Which would you choose?

Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability.

Which would you choose now?

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower).

In return for food, they pollinate the flower. This is an example of mutualism: benefit for all.

As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival.

Photo: Archilochus colubris, from Wikimedia Commons, by Dick Daniels

Researchers studying comparative anatomy collect data on bill length in two species of hummingbird: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird).

To do this, they need to collect sufficient, relevant, reliable data so they can test the null hypothesis (H0) that:

"there is no significant difference in bill length between the two species."

Photo: Archilochus colubris (male), Wikimedia Commons, by Joe Schneid

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broad-billed hummingbird (Wikimedia Commons)

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

         Bill length (±0.1 mm)
  n      A. colubris   C. latirostris
  1      13.0          17.0
  2      14.0          18.0
  3      15.0          18.0
  4      15.0          18.0
  5      15.0          19.0
  6      16.0          19.0
  7      16.0          19.0
  8      18.0          20.0
  9      18.0          20.0
  10     19.0          20.0
  Mean   15.9          18.8
  s

Calculate the mean using:
• Your calculator (sum of values / n)
• Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case, n = 10 for each group.

• Descriptive table title and number.
• Uncertainties must be included.
• All values should be centred in the cell.
• Raw data and the mean need to have consistent decimal places (in line with the uncertainty of the measuring tool).
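The same calculation can be sketched outside Excel. This is a minimal Python illustration, assuming the Table 1 values are typed in by hand; the names `a_colubris`, `c_latirostris` and `mean_1dp` are made up for this sketch and are not part of the original presentation.

```python
from statistics import mean

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def mean_1dp(values):
    """Mean rounded to one decimal place, in line with the ±0.1 mm uncertainty."""
    return round(mean(values), 1)


print(f"A. colubris:    {mean_1dp(a_colubris):.1f} mm (n = {len(a_colubris)})")
print(f"C. latirostris: {mean_1dp(c_latirostris):.1f} mm (n = {len(c_latirostris)})")
```

Rounding to one decimal place mirrors the rule that the mean should match the decimal places of the raw data.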

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. Bars: A. colubris 15.9 mm; C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0.]

• Descriptive title with graph number
• Labeled point
• y-axis clearly labeled, with uncertainty
• Make sure that the y-axis begins at zero
• x-axis labeled

From the means alone you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

         Bill length (±0.1 mm)
  n      A. colubris   C. latirostris
  1      13.0          17.0
  2      14.0          18.0
  3      15.0          18.0
  4      15.0          18.0
  5      15.0          19.0
  6      16.0          19.0
  7      16.0          19.0
  8      18.0          20.0
  9      18.0          20.0
  10     19.0          20.0
  Mean   15.9          18.8
  s      1.91          1.03

Standard deviation can have one more decimal place. Excel: =STDEV(highlight RAW data)

Which of the two sets of data has:

a. The longest mean bill length?

b. The greatest variability in the data?


Answers: a. C. latirostris; b. A. colubris
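As a cross-check on the values of s in Table 1, here is a small Python sketch. It assumes the sample standard deviation (dividing by n - 1), which is what Excel's =STDEV computes; the variable and function names are illustrative only.

```python
from statistics import stdev

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def sd_2dp(values):
    """Sample standard deviation (n - 1 denominator), one more decimal place than the raw data."""
    return round(stdev(values), 2)


print(f"A. colubris    s = {sd_2dp(a_colubris):.2f} mm")
print(f"C. latirostris s = {sd_2dp(c_latirostris):.2f} mm")

# The larger s marks the data set with the greater variability around its mean.
more_variable = "A. colubris" if sd_2dp(a_colubris) > sd_2dp(c_latirostris) else "C. latirostris"
print("Greatest variability:", more_variable)
```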

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data (A or B) has:

a. The highest mean?

b. The greatest variability in the data?

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris 15.9 mm; C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0.]

The title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars. Variability has been visualised.

The error bars overlap somewhat. What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be 'real'

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris 15.9 mm (n = 10); C. latirostris 18.8 mm (n = 10). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from -3.0 to 22.0.]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0): "There is no significant difference."

This is the 'default' hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.

Remember:
• We are never 'trying' to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table ("critical values"):

P value       0.1     0.05    0.02    0.01
Confidence    90%     95%     98%     99%

df
1             6.31    12.71   31.82   63.66
2             2.92    4.30    6.96    9.92
3             2.35    3.18    4.54    5.84
4             2.13    2.78    3.75    4.60
5             2.02    2.57    3.37    4.03
6             1.94    2.45    3.14    3.71
7             1.89    2.36    3.00    3.50
8             1.86    2.31    2.90    3.36
9             1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of 't' for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

"Degrees of Freedom (df)" is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?

We usually use P < 0.05 (95% confidence) in Biology, as our data can be highly variable.

"Degrees of Freedom (df)" is the total sample size minus two. Simple explanation: we are working in two directions, within each population and across populations.

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

At P = 0.05, cv = 2.069.

t > cv: 2.15 > 2.069

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
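The decision rule used in this worked example can be written as a short Python sketch. The function names are hypothetical, and treating t exactly equal to cv as "accept" is an assumption of this sketch, since the slides only cover t < cv and t > cv.

```python
def degrees_of_freedom(n_a, n_b):
    """Total sample size minus two, as defined for the two-sample t-table."""
    return n_a + n_b - 2


def t_test_conclusion(t, critical_value):
    """Reject H0 only when the calculated t exceeds the critical value (cv)."""
    if abs(t) > critical_value:
        return "reject H0: there is a significant difference"
    # Assumption of this sketch: t equal to cv is treated as not significant.
    return "accept H0: there is no significant difference"


# Worked example from the slides: t = 2.15, cv = 2.069 at P = 0.05.
print(t_test_conclusion(2.15, 2.069))

# Two groups of 10 measurements give 18 degrees of freedom.
print(degrees_of_freedom(10, 10))
```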

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

Critical value: 2.045

"There is no significant difference in the size of shells between north-side and south-side snail populations."

Critical value: 2.086

"There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability that the difference is due to chance), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two 'tails' moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B.)
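For readers without Excel, the unpaired comparison can be approximated by hand. The sketch below is an illustration rather than the presentation's own method: it computes a pooled (equal-variance) two-sample t statistic, which is the assumption behind an unpaired "type 2" test, and compares it with the critical value 2.10 read from the t-table in this deck for df = 18 at P = 0.05. It does not compute P itself, because the t-distribution is not in Python's standard library; `scipy.stats.ttest_ind` is one option if SciPy is available.

```python
from math import sqrt
from statistics import mean, variance

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def pooled_t(sample_a, sample_b):
    """Unpaired two-sample t statistic assuming equal variances; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_b) - mean(sample_a)) / standard_error
    return t, df


t, df = pooled_t(a_colubris, c_latirostris)
critical_value = 2.10  # from the two-tailed t-table, df = 18, P = 0.05

print(f"t = {t:.2f}, df = {df}, cv = {critical_value}")
print("reject H0" if abs(t) > critical_value else "accept H0")
```

The calculated t is well above the critical value, which agrees with the very small P value reported by Excel.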

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference.
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.

[Graph annotation: no overlap]

=CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05, C15, 10)
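To see what the Excel formula is doing, here is a hedged Python sketch of the same margin: the normal-distribution z value for the chosen alpha, multiplied by the standard deviation and divided by the square root of the sample size. The helper names `confidence_margin` and `ci95` are illustrative, and the data are the Table 1 bill lengths.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def confidence_margin(alpha, sd, n):
    """Half-width of the confidence interval: z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * sd / sqrt(n)


def ci95(values):
    """Lower and upper ends of the 95% confidence interval around the mean."""
    margin = confidence_margin(0.05, stdev(values), len(values))
    return mean(values) - margin, mean(values) + margin


a_low, a_high = ci95(a_colubris)
c_low, c_high = ci95(c_latirostris)
print(f"A. colubris:    {a_low:.2f} to {a_high:.2f} mm")
print(f"C. latirostris: {c_low:.2f} to {c_high:.2f} mm")
print("Overlap" if a_high >= c_low and c_high >= a_low else "No overlap")
```

For these data the two intervals do not overlap, which matches the "no overlap" annotation on the graph.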

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no significant difference
• No overlap: significant difference

Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n = 21 for each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
Confidence    90%     95%     98%     99%     99.5%

df
1             6.31    12.71   31.82   63.66   127.34
2             2.92    4.30    6.96    9.92    14.09
3             2.35    3.18    4.54    5.84    7.45
4             2.13    2.78    3.75    4.60    5.60
5             2.02    2.57    3.37    4.03    4.77
6             1.94    2.45    3.14    3.71    4.32
7             1.89    2.36    3.00    3.50    4.03
8             1.86    2.31    2.90    3.36    3.83
9             1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

(df = degrees of freedom)

Cartoon from http://www.xkcd.com/552

"Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From Mr T's Excel Statbook

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See, Think, Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See, Think, Wonder: Visible Thinking Routine

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common, but that do not directly affect each other.
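A confounding variable can be illustrated with invented numbers. In the Python sketch below, all values are hypothetical: `variable_x` and `variable_y` are each built from the same `confounder` plus a small fixed offset, and neither is calculated from the other. They still show a strong correlation.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: one hidden condition drives both measured variables.
confounder = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
offsets_x = [0.5, -0.5, 1.0, -1.0, 0.0, 0.5, -0.5, 1.0, -1.0, 0.0]
offsets_y = [-1.0, 1.0, 0.0, 0.5, -0.5, -1.0, 1.0, 0.0, 0.5, -0.5]

variable_x = [2 * c + e for c, e in zip(confounder, offsets_x)]
variable_y = [3 * c + e for c, e in zip(confounder, offsets_y)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance_sum / (spread_x * spread_y)


# x and y correlate strongly even though neither was computed from the other.
print(f"r = {pearson_r(variable_x, variable_y):.2f}")
```

A high r here reflects the shared dependence on the confounder, which is why a controlled experiment is needed before claiming cause.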

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded, but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen

Page 7: Statistical Analysis

"Which medicine should I prescribe?"

Image from http://www.msf.org/international-activity-report-2010-sierra-leone. Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good

Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability.

Which would you choose now?

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower).

In return for food, they pollinate the flower. This is an example of mutualism: benefit for all.

As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival.

Photo: Archilochus colubris, from Wikimedia Commons, by Dick Daniels

Researchers studying comparative anatomy collect data on bill length in two species of hummingbird: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird).

To do this, they need to collect sufficient, relevant, reliable data so they can test the null hypothesis (H0) that:

"there is no significant difference in bill length between the two species."

Photo: Archilochus colubris (male), Wikimedia Commons, by Joe Schneid

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broad-billed hummingbird (Wikimedia Commons)

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

         Bill length (±0.1 mm)
  n      A. colubris   C. latirostris
  1      13.0          17.0
  2      14.0          18.0
  3      15.0          18.0
  4      15.0          18.0
  5      15.0          19.0
  6      16.0          19.0
  7      16.0          19.0
  8      18.0          20.0
  9      18.0          20.0
  10     19.0          20.0
  Mean   15.9          18.8
  s

Calculate the mean using:
• Your calculator (sum of values / n)
• Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case, n = 10 for each group.

• Descriptive table title and number.
• Uncertainties must be included.
• All values should be centred in the cell.
• Raw data and the mean need to have consistent decimal places (in line with the uncertainty of the measuring tool).

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. Bars: A. colubris 15.9 mm; C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0.]

• Descriptive title with graph number
• Labeled point
• y-axis clearly labeled, with uncertainty
• Make sure that the y-axis begins at zero
• x-axis labeled

From the means alone you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

         Bill length (±0.1 mm)
  n      A. colubris   C. latirostris
  1      13.0          17.0
  2      14.0          18.0
  3      15.0          18.0
  4      15.0          18.0
  5      15.0          19.0
  6      16.0          19.0
  7      16.0          19.0
  8      18.0          20.0
  9      18.0          20.0
  10     19.0          20.0
  Mean   15.9          18.8
  s      1.91          1.03

Standard deviation can have one more decimal place. Excel: =STDEV(highlight RAW data)

Which of the two sets of data has:

a. The longest mean bill length?

b. The greatest variability in the data?


Answers: a. C. latirostris; b. A. colubris

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data (A or B) has:

a. The highest mean?

b. The greatest variability in the data?

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris 15.9 mm; C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0.]

The title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars. Variability has been visualised.

The error bars overlap somewhat. What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be 'real'

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris 15.9 mm (n = 10); C. latirostris 18.8 mm (n = 10). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from -3.0 to 22.0.]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0): "There is no significant difference."

This is the 'default' hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.

Remember:
• We are never 'trying' to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table ("critical values"):

P value       0.1     0.05    0.02    0.01
Confidence    90%     95%     98%     99%

df
1             6.31    12.71   31.82   63.66
2             2.92    4.30    6.96    9.92
3             2.35    3.18    4.54    5.84
4             2.13    2.78    3.75    4.60
5             2.02    2.57    3.37    4.03
6             1.94    2.45    3.14    3.71
7             1.89    2.36    3.00    3.50
8             1.86    2.31    2.90    3.36
9             1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of 't' for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

"Degrees of Freedom (df)" is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

We usually use Plt005 (95 confidence) in Biology as our data can be highly variable

Simple explanation we are working in two directions ndash within each population and across populations

ldquocritical valuesrdquo
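The lookup described above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names T_TABLE, degrees_of_freedom and critical_value are hypothetical, and the dictionary holds just three rows copied from the example two-tailed t-table.

```python
# Excerpt of the example two-tailed t-table: {df: {P value: critical value}}.
# Only three rows are copied here; extend it from the full table as needed.
T_TABLE = {
    8: {0.1: 1.86, 0.05: 2.31, 0.02: 2.90, 0.01: 3.36},
    9: {0.1: 1.83, 0.05: 2.26, 0.02: 2.82, 0.01: 3.25},
    10: {0.1: 1.81, 0.05: 2.23, 0.02: 2.76, 0.01: 3.17},
}


def degrees_of_freedom(n_a, n_b):
    """df when comparing two samples: total sample size minus two."""
    return n_a + n_b - 2


def critical_value(df, p=0.05):
    """Look up the critical value for a given df and P value (default P = 0.05)."""
    return T_TABLE[df][p]


# Hypothetical example: two groups of five measurements each.
df = degrees_of_freedom(5, 5)
print(f"df = {df}, critical value at P = 0.05: {critical_value(df)}")
print(f"critical value at P = 0.01: {critical_value(df, 0.01)}")
```

Reading along a row shows the pattern the questions above ask about: as P gets smaller (higher confidence), the critical value gets larger.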

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

At P = 0.05, the critical value (cv) is 2.069.

t = 2.15 > cv = 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: “There is a significant difference in the wing spans of the two populations of birds.”

Further examples from the same t-table:
• Critical value 2.045: “There is no significant difference in the size of shells between north-side and south-side snail populations.”
• Critical value 2.086: “There is a significant difference in the resting heart rates between the two groups of swimmers.”
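The accept/reject rule can be written as a tiny Python function. This is a sketch for illustration only; the function name t_test_conclusion is hypothetical, and taking the absolute value of t is an added assumption (the sign of t only depends on which mean is subtracted from which).

```python
def t_test_conclusion(t, cv):
    """Apply the rule from the slides: reject H0 only if t exceeds the critical value."""
    if abs(t) > cv:
        return "reject H0: there is a significant difference"
    return "accept H0: there is no significant difference"


# Wing-span example from the slides: t = 2.15, critical value = 2.069 at P = 0.05.
print(t_test_conclusion(2.15, 2.069))
```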

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability that the difference is due to chance), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population.
(Type 1 pairs the results of each individual in set A with the same individual in set B.)
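For readers who want to see what sits behind the spreadsheet function, here is a hedged Python sketch of an unpaired, equal-variance t statistic using only the standard library. It is not the author's method: the function name unpaired_t is hypothetical, the bill lengths are the Table 1 values with their decimal points restored (an assumption about the garbled transcript), and the result is compared with the t-table value for df = 18 rather than converted to a P value.

```python
from math import sqrt
from statistics import mean, variance

# Bill lengths in mm (±0.1 mm) from Table 1, decimal points restored.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def unpaired_t(sample_a, sample_b):
    """Return (t, df) for an unpaired t-test that pools the two sample variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # total sample size minus two
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = abs(mean(sample_a) - mean(sample_b)) / std_error
    return t, df


t, df = unpaired_t(a_colubris, c_latirostris)
CV_DF18_P005 = 2.10  # two-tailed critical value for df = 18 at P = 0.05 (t-table)
print(f"t = {t:.2f}, df = {df}, critical value = {CV_DF18_P005}")
print("reject H0" if t > CV_DF18_P005 else "accept H0")
```

With these data t is well above the critical value, which agrees with the very small P value reported by the spreadsheet.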

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference.
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.

[Graph annotation: no overlap]

=CONFIDENCE.NORM(0.05, stdev, samplesize)
e.g. =CONFIDENCE.NORM(0.05, C15, 10)
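As a rough illustration of what that spreadsheet formula returns, the sketch below computes the same half-width, z × stdev / √n, in Python. The function name confidence_norm is a hypothetical stand-in, and the means (15.9, 18.8), standard deviations (1.91, 1.03) and n = 10 are taken from the hummingbird example, with decimal points restored.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha, stdev, size):
    """Half-width of a normal-based confidence interval: z * stdev / sqrt(size)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * stdev / sqrt(size)


# Hummingbird example: mean, standard deviation, n = 10 for each species.
margin_a = confidence_norm(0.05, 1.91, 10)  # A. colubris
margin_b = confidence_norm(0.05, 1.03, 10)  # C. latirostris

ci_a = (15.9 - margin_a, 15.9 + margin_a)
ci_b = (18.8 - margin_b, 18.8 + margin_b)
bars_overlap = ci_a[1] >= ci_b[0]

print(f"A. colubris:    15.9 ± {margin_a:.2f} mm")
print(f"C. latirostris: 18.8 ± {margin_b:.2f} mm")
print("overlap" if bars_overlap else "no overlap")
```

The margin is what gets plotted above and below each mean as the 95% confidence interval error bar.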

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no significant difference
• No overlap: significant difference

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21 for each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
Confidence    90%     95%     98%     99%     99.5%

Degrees of freedom
 1            6.31    12.71   31.82   63.66   127.34
 2            2.92    4.30    6.96    9.92    14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552

“Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.”

From MrT’s Excel Statbook.

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder is a Visible Thinking Routine.

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

@IBiologyStephen


Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower).

In return for food, they pollinate the flower. This is an example of mutualism – benefit for all.

As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival.

Photo: Archilochus colubris, from Wikimedia Commons, by Dick Daniels.

Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird).

To do this, they need to collect sufficient, relevant, reliable data so they can test the Null hypothesis (H0) that:

“there is no significant difference in bill length between the two species.”

Photo: Archilochus colubris (male), Wikimedia Commons, by Joe Schneid.

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broad-billed hummingbird (Wikimedia Commons).

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

        Bill length (±0.1 mm)
n       A. colubris    C. latirostris
1       13.0           17.0
2       14.0           18.0
3       15.0           18.0
4       15.0           18.0
5       15.0           19.0
6       16.0           19.0
7       16.0           19.0
8       18.0           20.0
9       18.0           20.0
10      19.0           20.0
Mean    15.9           18.8

Calculate the mean using:
• Your calculator (sum of values / n)
• Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case n=10 for each group.

All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool).

Uncertainties must be included.

Descriptive table title and number.
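The same calculation can be checked outside a spreadsheet. The short Python sketch below is only an illustration (variable names are hypothetical); it uses the Table 1 bill lengths with decimal points restored and rounds the mean to one decimal place to match the ±0.1 mm uncertainty.

```python
from statistics import mean

# Bill lengths in mm (±0.1 mm) from Table 1, decimal points restored.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Mean = sum of values / n, reported to the same precision as the raw data.
mean_a = round(mean(a_colubris), 1)
mean_b = round(mean(c_latirostris), 1)

print(f"A. colubris mean bill length:    {mean_a} mm (n={len(a_colubris)})")
print(f"C. latirostris mean bill length: {mean_b} mm (n={len(c_latirostris)})")
```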

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. Labeled points: A. colubris 15.9 mm, C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0.]

Descriptive title, with graph number.

Labeled point.

Y-axis clearly labeled, with uncertainty.

Make sure that the y-axis begins at zero.

x-axis labeled.

From the means alone you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

http://click4biology.info/c4b/1/gcStat.htm

http://mathbits.com/MathBits/TINSection/Statistics1/Spreadsheet.html

Standard deviation is a measure of the spread of most of the data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

        Bill length (±0.1 mm)
n       A. colubris    C. latirostris
1       13.0           17.0
2       14.0           18.0
3       15.0           18.0
4       15.0           18.0
5       15.0           19.0
6       16.0           19.0
7       16.0           19.0
8       18.0           20.0
9       18.0           20.0
10      19.0           20.0
Mean    15.9           18.8
s       1.91           1.03

Standard deviation can have one more decimal place: =STDEV(highlight RAW data)

Which of the two sets of data has:
a. The longest mean bill length? C. latirostris
b. The greatest variability in the data? A. colubris
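As a cross-check on the spreadsheet values, the sample standard deviation can be computed with Python's standard library. This is an illustrative sketch, not part of the original slides: it assumes the Table 1 values with decimal points restored, and uses statistics.stdev, which divides by n − 1 in the same way as the sample standard deviation in the table.

```python
from statistics import stdev

# Bill lengths in mm (±0.1 mm) from Table 1, decimal points restored.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation, reported with one more decimal place than the raw data.
sd_a = round(stdev(a_colubris), 2)
sd_b = round(stdev(c_latirostris), 2)

print(f"A. colubris s = {sd_a} mm")
print(f"C. latirostris s = {sd_b} mm")
print("Greater variability:", "A. colubris" if sd_a > sd_b else "C. latirostris")
```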

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data has:
a. The highest mean?
b. The greatest variability in the data?

A

B

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Labeled points: A. colubris 15.9 mm, C. latirostris 18.8 mm. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0.]

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars.

Variability has been visualised.

The error bars overlap somewhat.

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Labeled points: A. colubris 15.9 mm (n=10), C. latirostris 18.8 mm (n=10). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.
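To put a number on that "very small overlap", the sketch below compares the two mean ± standard deviation ranges in Python. It is illustrative only: the helper names sd_range and overlap_width are hypothetical, and the means and standard deviations (15.9 ± 1.91 mm and 18.8 ± 1.03 mm) are the hummingbird values with decimal points restored.

```python
def sd_range(mean_value, sd):
    """Return the (lower, upper) ends of an error bar drawn as mean ± one standard deviation."""
    return (mean_value - sd, mean_value + sd)


def overlap_width(range_a, range_b):
    """Width of the region shared by two ranges; 0.0 if they do not overlap."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return max(0.0, high - low)


bars_a = sd_range(15.9, 1.91)  # A. colubris
bars_b = sd_range(18.8, 1.03)  # C. latirostris
width = overlap_width(bars_a, bars_b)

print(f"A. colubris:    {bars_a[0]:.2f} to {bars_a[1]:.2f} mm")
print(f"C. latirostris: {bars_b[0]:.2f} to {bars_b[1]:.2f} mm")
print(f"Overlap: {width:.2f} mm")
```

An overlap this small does not settle the question either way, which is why a statistical test is the next step.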



Researchers studying comparative anatomy collect data on bill-length in two species of hummingbirds Archilochus colubris (red-throated hummingbird) and Cynanthus latirostris (broadbilled hummingbird)

To do this they need to collect sufficientrelevant reliable data so they can testthe Null hypothesis (H0) that

ldquothere is no significant difference in bill length between the two speciesrdquo

Photo Archilochus colubris (male) wikimedia commons by Joe Schneid

The sample size must be large enough to provide

sufficient reliable data and for us to carry out relevant statistical

tests for significance

We must also be mindful of uncertainty in our measuring tools

and error in our results

Photo Broadbilled hummingbird (wikimedia commons)

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean      s           

Calculate the mean using bull Your calculator (sum of values n)

bull Excel

=AVERAGE(highlight raw data)

n = sample size The bigger the better In this case n=10 for each group

All values should be centred in the cell with decimal places consistent with the measuring tool uncertainty

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s

       

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool)

Uncertainties must be included

Descriptive table title and number

DELETE

X

DELETE

X

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Descriptive title with graph number

Labeled point

Y-axis clearly labeled with uncertainty

Make sure that the y-axis begins at zero

x-axis labeled

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

From the means alone you might conclude that C latirostris has a longer bill than A colubris

But the mean only tells part of the story

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

C latirostris

A colubris

Standard deviation is a measure of the spread of most of the data Error bars are a graphical representation of the variability of data

Which of the two sets of data has

a The highest mean

b The greatest variability in the data

A

B

Error bars could represent standard deviation range or confidence intervals

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars.

Variability has been visualised.

The error bars overlap somewhat.

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’
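The idea of overlap can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the helper names sd_bar and overlap are made up for this example, each bar is taken as mean ± 1 standard deviation, and the data are the Table 1 bill lengths.

```python
import statistics

# Bill lengths in mm (±0.1 mm), taken from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def sd_bar(values):
    """Return the (lower, upper) ends of a mean ± 1 standard deviation bar."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return m - s, m + s


def overlap(bar_1, bar_2):
    """Length of the region shared by two bars (0.0 if they do not overlap)."""
    return max(0.0, min(bar_1[1], bar_2[1]) - max(bar_1[0], bar_2[0]))


bar_a = sd_bar(a_colubris)
bar_c = sd_bar(c_latirostris)
shared = overlap(bar_a, bar_c)
print(f"A. colubris bar:    {bar_a[0]:.2f} to {bar_a[1]:.2f} mm")
print(f"C. latirostris bar: {bar_c[0]:.2f} to {bar_c[1]:.2f} mm")
print(f"Shared region: {shared:.2f} mm")
```

A shared region this small is only a clue. It does not tell us whether the difference is significant.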

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm). Plotted means: A. colubris 15.9 mm (n=10), C. latirostris 18.8 mm (n=10).

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0): “There is no significant difference.”

This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
• If we accept H0, then the means are not significantly different
• If we reject H0, then the means are significantly different

Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table

  P value       0.1     0.05    0.02    0.01
  confidence    90%     95%     98%     99%

  degrees of
  freedom
  1             6.31    12.71   31.82   63.66
  2             2.92    4.30    6.96    9.92
  3             2.35    3.18    4.54    5.84
  4             2.13    2.78    3.75    4.60
  5             2.02    2.57    3.37    4.03
  6             1.94    2.45    3.14    3.71
  7             1.89    2.36    3.00    3.50
  8             1.86    2.31    2.90    3.36
  9             1.83    2.26    2.82    3.25
  10            1.81    2.23    2.76    3.17

The values in the body of the table are the “critical values”.

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

“Degrees of Freedom (df)” is the total sample size minus two. Simple explanation: we are working in two directions, within each population and across populations.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.
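Reading a critical value from the example t-table above is a simple lookup, which a short Python sketch can mirror. The names P_LEVELS, T_TABLE, degrees_of_freedom and critical_value are assumptions made for this example, and the two sample sizes of 6 are hypothetical. The critical values are the rows for df 1 to 10, read with two decimal places.

```python
# Two-tailed critical values of t from the example t-table (df 1 to 10).
# Each tuple follows the column order of P_LEVELS.
P_LEVELS = (0.1, 0.05, 0.02, 0.01)
T_TABLE = {
    1: (6.31, 12.71, 31.82, 63.66),
    2: (2.92, 4.30, 6.96, 9.92),
    3: (2.35, 3.18, 4.54, 5.84),
    4: (2.13, 2.78, 3.75, 4.60),
    5: (2.02, 2.57, 3.37, 4.03),
    6: (1.94, 2.45, 3.14, 3.71),
    7: (1.89, 2.36, 3.00, 3.50),
    8: (1.86, 2.31, 2.90, 3.36),
    9: (1.83, 2.26, 2.82, 3.25),
    10: (1.81, 2.23, 2.76, 3.17),
}


def degrees_of_freedom(n_1, n_2):
    """df for comparing two samples: total sample size minus two."""
    return n_1 + n_2 - 2


def critical_value(df, p=0.05):
    """Look up the critical value for a given df and P level."""
    return T_TABLE[df][P_LEVELS.index(p)]


df = degrees_of_freedom(6, 6)  # two hypothetical samples of 6 individuals
cv = critical_value(df, 0.05)
print(f"df = {df}, critical value at P = 0.05: {cv}")
print(f"critical value at P = 0.01: {critical_value(df, 0.01)}")
```

Moving to a smaller P (higher confidence) along the same row gives a larger critical value, so t has to be bigger before H0 is rejected.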

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

At P = 0.05 the critical value (cv) is 2.069.

t = 2.15 > cv = 2.069

If t < cv, accept H0 (there is no significant difference). If t > cv, reject H0 (there is a significant difference).

Conclusion: “There is a significant difference in the wing spans of the two populations of birds.”
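The accept/reject rule used in this worked example can be written as a small Python function. This is a sketch under stated assumptions: the function name t_test_decision is made up, the absolute value of t is used so the sign of the difference does not matter, and a t exactly equal to the critical value is treated as not significant (the slides do not cover that case).

```python
def t_test_decision(t, cv):
    """Apply the rule from the slides: reject H0 only when |t| exceeds cv."""
    if abs(t) > cv:
        return "reject H0: there is a significant difference"
    # Assumption: t equal to cv is treated as not significant.
    return "accept H0: there is no significant difference"


# Values from the wing span example: t = 2.15, critical value = 2.069.
print(t_test_decision(2.15, 2.069))
```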

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

cv = 2.045

Conclusion: “There is no significant difference in the size of shells between north-side and south-side snail populations.”

cv = 2.086

Conclusion: “There is a significant difference in the resting heart rates between the two groups of swimmers.”

Excel can jump straight to a value of P for our results. One function (=ttest) compares both sets of data.

As it calculates P directly (the probability of seeing a difference at least this large by chance alone, if H0 were true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean, and we test for a difference in either direction.

Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B.)
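For readers who want to see what a two-tailed, unpaired t-test involves, here is a rough Python sketch using only the standard library. It is not Excel's implementation. It assumes equal variances (a pooled-variance t-test), and it finds P by numerically integrating the t distribution with Simpson's rule. The function names are made up for this example, and the data are the Table 1 bill lengths.

```python
import math
import statistics

# Bill lengths in mm (±0.1 mm), taken from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def unpaired_t(sample_1, sample_2):
    """Return (t, df) for an unpaired two-sample t-test with pooled variance."""
    n_1, n_2 = len(sample_1), len(sample_2)
    df = n_1 + n_2 - 2
    pooled_var = ((n_1 - 1) * statistics.variance(sample_1)
                  + (n_2 - 1) * statistics.variance(sample_2)) / df
    std_error = math.sqrt(pooled_var * (1 / n_1 + 1 / n_2))
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / std_error
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P: area under both tails beyond |t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return 1 - central_area


t, df = unpaired_t(a_colubris, c_latirostris)
p = two_tailed_p(t, df)
print(f"t = {t:.2f}, df = {df}, P = {p:.5f}")
```

The sign of t only shows which mean is larger. For a two-tailed test it is the size of t that matters.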

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference
• Where there is no overlap, there is a significant difference
• If the overlap (or difference) is small, a t-test should still be carried out

(Graph label: no overlap)

=CONFIDENCE.NORM(0.05,stdev,samplesize) e.g. =CONFIDENCE.NORM(0.05,C15,10)
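The Excel formula above returns the half-width of the interval, which is added to and subtracted from the mean to draw the bar. A Python sketch of the same calculation follows, assuming the normal-distribution formula z × s / √n. The names confidence_norm and ci95 are assumptions for this example, and the data are the Table 1 bill lengths.

```python
import math
import statistics

# Bill lengths in mm (±0.1 mm), taken from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def confidence_norm(alpha, stdev, size):
    """Half-width of a normal-based confidence interval: z * s / sqrt(n)."""
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    return z * stdev / math.sqrt(size)


def ci95(values):
    """Return the (lower, upper) ends of a 95% confidence interval bar."""
    m = statistics.mean(values)
    half_width = confidence_norm(0.05, statistics.stdev(values), len(values))
    return m - half_width, m + half_width


ci_a = ci95(a_colubris)
ci_c = ci95(c_latirostris)
print(f"A. colubris 95% CI:    {ci_a[0]:.2f} to {ci_a[1]:.2f} mm")
print(f"C. latirostris 95% CI: {ci_c[0]:.2f} to {ci_c[1]:.2f} mm")
print("Overlap" if ci_a[1] >= ci_c[0] and ci_c[1] >= ci_a[0] else "No overlap")
```

Because the half-width shrinks as n grows, confidence interval bars are narrower than standard deviation bars for the same data.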

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no sig. diff.
• No overlap: sig. diff.

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score.

(Error bars = standard deviation; n=21 for each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

  P value       0.1     0.05    0.02    0.01    0.005
  confidence    90%     95%     98%     99%     99.50%

  degrees of
  freedom
  1             6.31    12.71   31.82   63.66   127.34
  2             2.92    4.30    6.96    9.92    14.09
  3             2.35    3.18    4.54    5.84    7.45
  4             2.13    2.78    3.75    4.60    5.60
  5             2.02    2.57    3.37    4.03    4.77
  6             1.94    2.45    3.14    3.71    4.32
  7             1.89    2.36    3.00    3.50    4.03
  8             1.86    2.31    2.90    3.36    3.83
  9             1.83    2.26    2.82    3.25    3.69
  10            1.81    2.23    2.76    3.17    3.58
  11            1.80    2.20    2.72    3.11    3.50
  12            1.78    2.18    2.68    3.05    3.43
  13            1.77    2.16    2.65    3.01    3.37
  14            1.76    2.14    2.62    2.98    3.33
  15            1.75    2.13    2.60    2.95    3.29
  16            1.75    2.12    2.58    2.92    3.25
  17            1.74    2.11    2.57    2.90    3.22
  18            1.73    2.10    2.55    2.88    3.20
  19            1.73    2.09    2.54    2.86    3.17
  20            1.72    2.09    2.53    2.85    3.15
  21            1.72    2.08    2.52    2.83    3.14
  22            1.72    2.07    2.51    2.82    3.12
  23            1.71    2.07    2.50    2.81    3.10
  24            1.71    2.06    2.49    2.80    3.09
  25            1.71    2.06    2.49    2.79    3.08
  26            1.71    2.06    2.48    2.78    3.07
  27            1.70    2.05    2.47    2.77    3.06
  28            1.70    2.05    2.47    2.76    3.05
  29            1.70    2.05    2.46    2.76    3.04
  30            1.70    2.04    2.46    2.75    3.03
  31            1.70    2.04    2.45    2.74    3.02
  32            1.69    2.04    2.45    2.74    3.02
  33            1.69    2.03    2.44    2.73    3.01
  34            1.69    2.03    2.44    2.73    3.00
  35            1.69    2.03    2.44    2.72    3.00
  36            1.69    2.03    2.43    2.72    2.99
  37            1.69    2.03    2.43    2.72    2.99
  38            1.69    2.02    2.43    2.71    2.98
  39            1.68    2.02    2.43    2.71    2.98
  40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.

From MrT’s Excel Statbook.

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common, even though neither correlated variable directly affects the other.
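A confounding variable can be illustrated with a small Python sketch. Everything in it is hypothetical: the numbers are invented purely for illustration (they are not real measurements), and the scenario, in which daily temperature drives both ice cream sales and sunburn cases, is an assumption chosen for this example. The function pearson_r is a plain implementation of the Pearson correlation coefficient.

```python
import math

# HYPOTHETICAL data, invented for illustration only (not real measurements).
# Assumed scenario: hotter days increase both ice cream sales and sunburn cases.
temperature_c = [18, 20, 22, 24, 26, 28, 30, 32]
ice_cream_sales = [40, 45, 47, 52, 58, 59, 65, 70]
sunburn_cases = [3, 5, 5, 7, 8, 10, 10, 12]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(spread_x * spread_y)


r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"ice cream vs sunburn:     r = {r:.2f}")
print(f"temperature vs ice cream: r = {pearson_r(temperature_c, ice_cream_sales):.2f}")
print(f"temperature vs sunburn:   r = {pearson_r(temperature_c, sunburn_cases):.2f}")
```

The two outcomes correlate strongly with each other, yet in this made-up scenario neither causes the other. Both simply follow temperature.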

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

@IBiologyStephen

Page 10: Statistical Analysis

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance.

We must also be mindful of uncertainty in our measuring tools and error in our results.

Photo: Broad-billed hummingbird (Wikimedia Commons)

The mean is a measure of the central tendency of a set of data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

          Bill length (±0.1 mm)
  n       A. colubris    C. latirostris
  1       13.0           17.0
  2       14.0           18.0
  3       15.0           18.0
  4       15.0           18.0
  5       15.0           19.0
  6       16.0           19.0
  7       16.0           19.0
  8       18.0           20.0
  9       18.0           20.0
  10      19.0           20.0
  Mean    15.9           18.8
  s

Calculate the mean using:
• Your calculator (sum of values / n)
• Excel: =AVERAGE(highlight raw data)

n = sample size. The bigger the better. In this case n=10 for each group.

All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool).

Uncertainties must be included.

Descriptive table title and number.
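The calculator method (sum of values divided by n) can also be sketched in Python as a cross-check. This is an illustration rather than part of the original slides: the variable and function names are assumptions, and the data are the Table 1 bill lengths read as millimetres with one decimal place.

```python
# Bill lengths in mm (±0.1 mm), taken from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def mean(values):
    """Arithmetic mean: sum of values divided by the sample size n."""
    return sum(values) / len(values)


# Round to one decimal place, matching the ±0.1 mm uncertainty.
mean_a = round(mean(a_colubris), 1)
mean_c = round(mean(c_latirostris), 1)
print(f"A. colubris mean = {mean_a} mm (n={len(a_colubris)})")
print(f"C. latirostris mean = {mean_c} mm (n={len(c_latirostris)})")
```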

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0. Plotted means: A. colubris 15.9 mm, C. latirostris 18.8 mm.

• Descriptive title, with graph number
• Labeled point
• Y-axis clearly labeled, with uncertainty
• Make sure that the y-axis begins at zero
• x-axis labeled

From the means alone you might conclude that C. latirostris has a longer bill than A. colubris.

But the mean only tells part of the story.

http://click4biology.info/c4b/1/gcStat.htm


Page 11: Statistical Analysis

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean      s           

Calculate the mean using bull Your calculator (sum of values n)

bull Excel

=AVERAGE(highlight raw data)

n = sample size The bigger the better In this case n=10 for each group

All values should be centred in the cell with decimal places consistent with the measuring tool uncertainty

The mean is a measure of the central tendency of a set of data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s

       

Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool)

Uncertainties must be included

Descriptive table title and number

DELETE

X

DELETE

X

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Descriptive title with graph number

Labeled point

Y-axis clearly labeled with uncertainty

Make sure that the y-axis begins at zero

x-axis labeled

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

From the means alone you might conclude that C latirostris has a longer bill than A colubris

But the mean only tells part of the story

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

        Bill length (±0.1 mm)
n       A. colubris    C. latirostris
1       13.0           17.0
2       14.0           18.0
3       15.0           18.0
4       15.0           18.0
5       15.0           19.0
6       16.0           19.0
7       16.0           19.0
8       18.0           20.0
9       18.0           20.0
10      19.0           20.0
Mean    15.9           18.8
s       1.91           1.03

Standard deviation can have one more decimal place: =STDEV(highlight RAW data)

Which of the two sets of data has:

a. the longest mean bill length? C. latirostris

b. the greatest variability in the data? A. colubris
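
The =STDEV calculation can be checked outside Excel. The sketch below assumes Python's statistics.stdev, which computes the sample standard deviation (dividing by n - 1), as Excel's STDEV does; the variable names are illustrative only.

```python
from statistics import stdev

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation, reported to one more decimal place than the raw data.
s_a = round(stdev(a_colubris), 2)
s_c = round(stdev(c_latirostris), 2)

print(f"A. colubris s = {s_a}")
print(f"C. latirostris s = {s_c}")

# The larger s marks the data set with the greater variability.
more_variable = "A. colubris" if s_a > s_c else "C. latirostris"
print(f"Greatest variability: {more_variable}")
```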

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data has:

a. the highest mean?

b. the greatest variability in the data?

A

B

Error bars could represent standard deviation, range or confidence intervals

Put the error bars for standard deviation on our graph


Delete the horizontal error bars

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: species of hummingbird; y-axis: mean bill length (±0.1 mm). Labeled points: A. colubris 15.9 mm, C. latirostris 18.8 mm]

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap
  • Lots of shared data points within each data set
  • Results are not likely to be significantly different from each other
  • Any difference is most likely due to chance

No overlap
  • No (or very few) shared data points within each data set
  • Results are more likely to be significantly different from each other
  • The difference is more likely to be ‘real’
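
The overlap check can also be done numerically. The sketch below is one possible approach, assuming error bars of mean ± 1 standard deviation and the Table 1 data; the helper names sd_interval and overlap_width are made up for this example.

```python
from statistics import mean, stdev

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def sd_interval(values):
    """Return the (lower, upper) ends of a mean ± 1 standard deviation error bar."""
    m = mean(values)
    s = stdev(values)
    return (m - s, m + s)


def overlap_width(bar_1, bar_2):
    """Return how far two error bars overlap (0.0 if they do not overlap)."""
    return max(0.0, min(bar_1[1], bar_2[1]) - max(bar_1[0], bar_2[0]))


bar_a = sd_interval(a_colubris)
bar_c = sd_interval(c_latirostris)
overlap = overlap_width(bar_a, bar_c)

print(f"A. colubris bar: {bar_a[0]:.2f} to {bar_a[1]:.2f} mm")
print(f"C. latirostris bar: {bar_c[0]:.2f} to {bar_c[1]:.2f} mm")
print(f"Overlap: {overlap:.2f} mm")
```

For these data the bars overlap by only a few hundredths of a millimetre, so the overlap alone does not settle whether the difference is significant.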

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: species of hummingbird; y-axis: mean bill length (±0.1 mm). Labeled points: A. colubris 15.9 mm (n=10), C. latirostris 18.8 mm (n=10)]

Our results show a very small overlap between the two sets of data

So how do we know if the difference is significant or not?

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0):

“There is no significant difference”

This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant
  • If we accept H0, then the means are not significantly different
  • If we reject H0, then the means are significantly different

Remember
  • We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis

Example two-tailed t-table of “critical values”

P value       0.1     0.05    0.02    0.01
confidence    90%     95%     98%     99%

degrees of freedom
 1            6.31    12.71   31.82   63.66
 2            2.92    4.30    6.96    9.92
 3            2.35    3.18    4.54    5.84
 4            2.13    2.78    3.75    4.60
 5            2.02    2.57    3.37    4.03
 6            1.94    2.45    3.14    3.71
 7            1.89    2.36    3.00    3.50
 8            1.86    2.31    2.90    3.36
 9            1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

“Degrees of Freedom (df)” is the total sample size minus two

  • What happens to the value of P as the confidence in the results increases?
  • What happens to the critical value as the confidence level increases?

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable

Simple explanation: we are working in two directions – within each population and across populations
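
The slides provide t ready-calculated, but the calculation can be sketched for the hummingbird data. The sketch below assumes an unpaired t-test with pooled (equal) variance, which is one common form and may not be the exact method behind the slides. The critical value 2.10 (df = 18, P = 0.05) is taken from a standard two-tailed t-table, and the function name unpaired_t is invented for this example.

```python
from math import sqrt
from statistics import mean, variance

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def unpaired_t(sample_1, sample_2):
    """Return (t, df) for an unpaired t-test that assumes equal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2  # total sample size minus two
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    t = abs(mean(sample_1) - mean(sample_2)) / standard_error
    return t, df


t_value, df = unpaired_t(a_colubris, c_latirostris)

# Assumed critical value for df = 18 at P = 0.05, from a two-tailed t-table.
critical_value = 2.10

print(f"t = {t_value:.2f}, df = {df}, cv = {critical_value}")
if t_value > critical_value:
    print("t > cv: reject H0 (there is a significant difference)")
else:
    print("t < cv: accept H0 (there is no significant difference)")
```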

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you)

P = 0.05

cv = 2.069

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference)
If t > cv, reject H0 (there is a significant difference)

Conclusion: “There is a significant difference in the wing spans of the two populations of birds”

Critical value: 2.045

“There is no significant difference in the size of shells between north-side and south-side snail populations”

Critical value: 2.086

“There is a significant difference in the resting heart rates between the two groups of swimmers”

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability that the difference is due to chance), we can determine significance directly

In this case P = 0.00051

This is much smaller than 0.005, so we are confident that we can reject H0

The difference is unlikely to be due to chance

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean. Type 2 (unpaired): we are comparing one whole population with the other whole population.

(Type 1 pairs the results of each individual in set A with the same individual in set B)
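
The P value can be approximated without Excel. The sketch below is a standard-library illustration, not Excel's own algorithm. It assumes a two-tailed, unpaired, equal-variance test (Excel's type 2) and integrates the t-distribution numerically with Simpson's rule. The helper names t_pdf and two_tailed_p are invented for this example.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def t_pdf(x, df):
    """Probability density of Student's t-distribution at x."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P: the area outside ±|t|, via Simpson's rule (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t


# Unpaired t statistic with pooled variance (equal variances assumed).
n1, n2 = len(a_colubris), len(c_latirostris)
df = n1 + n2 - 2
pooled = ((n1 - 1) * variance(a_colubris) + (n2 - 1) * variance(c_latirostris)) / df
t_value = abs(mean(a_colubris) - mean(c_latirostris)) / sqrt(pooled * (1 / n1 + 1 / n2))

p_value = two_tailed_p(t_value, df)
print(f"t = {t_value:.2f}, df = {df}, P = {p_value:.5f}")
```

The result is of the same order as the P value quoted from Excel, which supports rejecting H0.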

95% Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a result
  • Where there is overlap, there is not a significant difference
  • Where there is no overlap, there is a significant difference
  • If the overlap (or difference) is small, a t-test should still be carried out

=CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05, C15, 10)
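
The same margin can be reproduced outside Excel. This sketch assumes the normal-distribution formula z × s / √n, which is what CONFIDENCE.NORM is documented to compute. The Python function confidence_norm is an illustrative stand-in, and the means and standard deviations are the rounded Table 1 values.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha, standard_dev, size):
    """Half-width of a (1 - alpha) confidence interval using the normal distribution."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * standard_dev / sqrt(size)


# Rounded summary values from Table 1 (mm).
mean_a, s_a, n_a = 15.9, 1.91, 10
mean_c, s_c, n_c = 18.8, 1.03, 10

margin_a = confidence_norm(0.05, s_a, n_a)
margin_c = confidence_norm(0.05, s_c, n_c)

print(f"A. colubris: {mean_a} ± {margin_a:.2f} mm")
print(f"C. latirostris: {mean_c} ± {margin_c:.2f} mm")

# The bars overlap only if the top of the lower bar reaches the bottom of the upper bar.
bars_overlap = mean_a + margin_a >= mean_c - margin_c
print("Overlap" if bars_overlap else "No overlap")
```

With these values the 95% confidence interval bars do not overlap, unlike the standard deviation bars.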

Error bars can have very different purposes

Standard deviation
  • You really need to know this
  • Look for relative size of bars
  • Used to indicate spread of most of the data around the mean
  • Can imply reliability of data

95% Confidence Intervals
  • Adds value to labs where we are looking for differences
  • Look for overlap, not size
  • Overlap: no sig. diff.
  • No overlap: sig. diff.

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score.

(Error bars = standard deviation; n=21 for each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
confidence    90%     95%     98%     99%     99.5%

degrees of freedom
 1            6.31    12.71   31.82   63.66   127.34
 2            2.92    4.30    6.96    9.92    14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552

“Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.”

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine)

See: What is factual about the graph?
  • What are the axes?
  • What is being plotted?
  • What values are present?

Think: How is the graph interpreted?
  • What relationship is present?
  • Is cause implied?
  • What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
  • What do you need to know more about?

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.

To be able to determine causality through experimentation we need:
  • One clearly identified independent variable
  • Carefully measured dependent variable(s) that can be attributed to change in the independent variable
  • Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
  • Atmospheric CO2 concentrations and global warming
  • Atmospheric CO2 concentrations and the rate of photosynthesis
  • Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen

Page 13: Statistical Analysis

DELETE

X

DELETE

X

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Descriptive title with graph number

Labeled point

Y-axis clearly labeled with uncertainty

Make sure that the y-axis begins at zero

x-axis labeled

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

From the means alone you might conclude that C latirostris has a longer bill than A colubris

But the mean only tells part of the story

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data


a. C. latirostris (mean 18.8 mm)

b. A. colubris (s = 1.91 mm)
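The mean and standard deviation in Table 1 can also be checked outside Excel. The following is a minimal Python sketch, assuming the raw values from Table 1; the variable names and the `summarize` helper are illustrative and not part of the original presentation. `statistics.stdev` uses the sample (n − 1) formula, the same convention as Excel's =STDEV.

```python
from statistics import mean, stdev

# Raw bill lengths (mm) from Table 1; the variable names are illustrative.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def summarize(values):
    """Return (mean to 1 d.p., sample standard deviation to 2 d.p.)."""
    return round(mean(values), 1), round(stdev(values), 2)


for name, values in [("A. colubris", a_colubris),
                     ("C. latirostris", c_latirostris)]:
    m, s = summarize(values)
    print(f"{name}: mean = {m} mm, s = {s} mm")
```

The standard deviation is reported to one more decimal place than the raw data, matching the convention used in Table 1.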

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data has:

a. The highest mean?

b. The greatest variability in the data?

A

B

Error bars could represent standard deviation, range or confidence intervals

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), scale 0.0 to 20.0. Labeled points: A. colubris 15.9 mm; C. latirostris 18.8 mm.]

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), scale −3.0 to 22.0. Labeled points: A. colubris 15.9 mm (n=10); C. latirostris 18.8 mm (n=10).]

Our results show a very small overlap between the two sets of data
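How small the overlap is can be checked numerically. This is a rough Python sketch, assuming the error bars span mean ± 1 standard deviation and using the raw values from Table 1; the helper names `sd_interval` and `overlap` are hypothetical.

```python
from statistics import mean, stdev

# Raw bill lengths (mm) from Table 1; the variable names are illustrative.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def sd_interval(values):
    """Return the (lower, upper) ends of a mean ± 1 SD error bar."""
    m, s = mean(values), stdev(values)
    return m - s, m + s


def overlap(interval_a, interval_b):
    """Length of the range shared by two intervals (0.0 if they do not meet)."""
    low = max(interval_a[0], interval_b[0])
    high = min(interval_a[1], interval_b[1])
    return max(0.0, high - low)


bar_a = sd_interval(a_colubris)
bar_c = sd_interval(c_latirostris)
print(f"A. colubris bar:    {bar_a[0]:.2f} to {bar_a[1]:.2f} mm")
print(f"C. latirostris bar: {bar_c[0]:.2f} to {bar_c[1]:.2f} mm")
print(f"Shared range:       {overlap(bar_a, bar_c):.2f} mm")
```

The shared range is only a few hundredths of a millimetre, so the error bars alone cannot settle whether the difference is significant.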

So how do we know if the difference is significant or not?

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0)

“There is no significant difference”

This is the ‘default’ hypothesis that we always test. In our conclusion we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
• If we accept H0, then the means are not significantly different
• If we reject H0, then the means are significantly different

Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table (the entries are the “critical values”)

P value       0.1     0.05    0.02    0.01
confidence    90%     95%     98%     99%

degrees of freedom
1             6.31    12.71   31.82   63.66
2             2.92    4.30    6.96    9.92
3             2.35    3.18    4.54    5.84
4             2.13    2.78    3.75    4.60
5             2.02    2.57    3.37    4.03
6             1.94    2.45    3.14    3.71
7             1.89    2.36    3.00    3.50
8             1.86    2.31    2.90    3.36
9             1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

“Degrees of Freedom (df)” is the total sample size minus two

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?


We usually use P < 0.05 (95% confidence) in Biology as our data can be highly variable

Simple explanation: we are working in two directions – within each population and across populations
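The slides do not show how t itself is calculated. As a hedged illustration, the sketch below uses the standard pooled-variance formula for an unpaired two-sample t-test, applied to the Table 1 data. The formula choice and the `pooled_t` helper are assumptions, not taken from the presentation.

```python
import math
from statistics import mean, variance

# Raw bill lengths (mm) from Table 1; the variable names are illustrative.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def pooled_t(sample_a, sample_b):
    """Unpaired two-sample t with pooled variance; returns (|t|, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # total sample size minus two
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return abs(mean(sample_a) - mean(sample_b)) / std_error, df


t_value, df = pooled_t(a_colubris, c_latirostris)
print(f"t = {t_value:.2f}, df = {df}")
# With df = 18 the two-tailed critical value at P = 0.05 is 2.10,
# so this t is compared against 2.10.
```

For two samples of ten birds each, df = 10 + 10 − 2 = 18, which selects the row of the t-table to read.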

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you)

At P = 0.05 the critical value (cv) is 2.069

t = 2.15 > cv = 2.069

If t < cv, accept H0 (there is no significant difference)
If t > cv, reject H0 (there is a significant difference)

Conclusion: “There is a significant difference in the wing spans of the two populations of birds”
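The decision rule can be written as a small function. This is a sketch only: the function name `t_test_decision` is hypothetical, and the case where t exactly equals the critical value is not covered by the slides, so it is treated here as not rejecting H0 (an assumption).

```python
def t_test_decision(t, critical_value):
    """Compare |t| with the critical value and state the conclusion.

    Assumption: t exactly equal to the critical value does not reject H0.
    """
    if abs(t) > critical_value:
        return "reject H0: there is a significant difference"
    return "accept H0: there is no significant difference"


# Worked example from the slide: t = 2.15, critical value = 2.069 at P = 0.05.
print(t_test_decision(2.15, 2.069))
```

Using `abs(t)` means the rule gives the same answer whichever sample is subtracted from the other.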

Critical value: 2.045

“There is no significant difference in the size of shells between north-side and south-side snail populations”

Critical value: 2.086

“There is a significant difference in the resting heart rates between the two groups of swimmers”

Excel can jump straight to a value of P for our results. One function (=ttest) compares both sets of data.

As it calculates P directly (the probability of a difference at least this large arising by chance if H0 is true), we can determine significance directly.

In this case P = 0.00051

This is much smaller than 0.005, so we are confident that we can reject H0

The difference is unlikely to be due to chance

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean

Type 2 (unpaired): we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)
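To show where a two-tailed P value comes from, the sketch below integrates the t-distribution numerically using only the Python standard library. It is an illustration under stated assumptions, not Excel's actual implementation. The inputs t ≈ 4.22 and df = 18 are what the pooled two-sample formula gives for the Table 1 data, and the helper names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P: the area in both tails beyond |t|.

    The central area between 0 and |t| is found with Simpson's rule
    (steps must be even), doubled by symmetry, and subtracted from 1.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


# Assumed inputs: pooled two-sample t for Table 1 (t ≈ 4.22, df = 18).
p = two_tailed_p(4.22, 18)
print(f"P = {p:.5f}")
print("reject H0" if p < 0.05 else "accept H0")
```

The result should be of the same order as the P = 0.00051 reported on the slide, and well below the 0.05 threshold.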

95% Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference
• Where there is no overlap, there is a significant difference
• If the overlap (or difference) is small, a t-test should still be carried out

[Graph label: no overlap]

=CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05, C15, 10)
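For comparison, here is a minimal Python sketch of the same calculation, assuming =CONFIDENCE.NORM returns the half-width z × s / √n of a normal-based interval. The function `confidence_norm` mirrors the Excel argument order but is a hypothetical helper, and the data are the Table 1 values.

```python
import math
from statistics import NormalDist, mean, stdev

# Raw bill lengths (mm) from Table 1; the variable names are illustrative.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def confidence_norm(alpha, standard_dev, size):
    """Half-width of a normal-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * standard_dev / math.sqrt(size)


def ci95(values):
    """Return the (lower, upper) ends of a 95% confidence interval error bar."""
    m = mean(values)
    half_width = confidence_norm(0.05, stdev(values), len(values))
    return m - half_width, m + half_width


low_a, high_a = ci95(a_colubris)
low_c, high_c = ci95(c_latirostris)
print(f"A. colubris:    {low_a:.2f} to {high_a:.2f} mm")
print(f"C. latirostris: {low_c:.2f} to {high_c:.2f} mm")
print("overlap" if high_a >= low_c else "no overlap")
```

With samples as small as n = 10, a t-based interval (Excel's =CONFIDENCE.T) would be somewhat wider than this normal-based one, which is one more reason to follow up with a t-test when the gap between the bars is small.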

Error bars can have very different purposes

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no sig. diff.
• No overlap: sig. diff.

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/

Students watched a one-minute video of a lecture. In one video the lecturer was fluent and engaging. In the other video the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation)

n=21 n=21

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the Ig Nobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
confidence    90%     95%     98%     99%     99.5%

degrees of freedom
1             6.31    12.71   31.82   63.66   127.34
2             2.92    4.30    6.96    9.92    14.09
3             2.35    3.18    4.54    5.84    7.45
4             2.13    2.78    3.75    4.60    5.60
5             2.02    2.57    3.37    4.03    4.77
6             1.94    2.45    3.14    3.71    4.32
7             1.89    2.36    3.00    3.50    4.03
8             1.86    2.31    2.90    3.36    3.83
9             1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552/

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.
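A confounding variable can produce a strong correlation between two quantities that do not affect each other. The sketch below uses entirely made-up, hypothetical numbers (temperature driving both ice cream sales and sunburn cases) purely to illustrate the idea. None of the data or names come from the presentation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: temperature is the confounding variable.
temperature = [10, 15, 20, 25, 30]
ice_cream_sales = [2 * t + 5 for t in temperature]   # depends on temperature only
sunburn_cases = [3 * t - 10 for t in temperature]    # depends on temperature only

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"r = {r:.3f}")
```

The two derived variables are perfectly correlated even though neither causes the other; both simply follow temperature. Real data would be noisier, but the reasoning is the same.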

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

@IBiologyStephen


Page 15: Statistical Analysis

00

20

40

60

80

100

120

140

160

180

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C latirostris

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

From the means alone you might conclude that C latirostris has a longer bill than A colubris

But the mean only tells part of the story

httpclick4biologyinfoc4b1gcStathtm

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

C latirostris

A colubris

Standard deviation is a measure of the spread of most of the data Error bars are a graphical representation of the variability of data

Which of the two sets of data has

a The highest mean

b The greatest variability in the data

A

B

Error bars could represent standard deviation range or confidence intervals

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars This is very important

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
  • Lots of shared data points within each data set
  • Results are not likely to be significantly different from each other
  • Any difference is most likely due to chance

No overlap:
  • No (or very few) shared data points within each data set
  • Results are more likely to be significantly different from each other
  • The difference is more likely to be ‘real’
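To see how much two standard-deviation error bars overlap, the ends of each bar (mean ± 1 s) can be compared directly. This is a rough sketch using the Table 1 bill lengths (assumed to be 13.0, 14.0, ... mm); `sd_bar` and `overlap_length` are hypothetical helper names, not part of any library.

```python
from statistics import mean, stdev

# Bill lengths in mm, transcribed from Table 1 (n = 10 per species).
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def sd_bar(values):
    """Return the (lower, upper) ends of a mean +/- 1 standard deviation error bar."""
    m, s = mean(values), stdev(values)
    return m - s, m + s

def overlap_length(bar1, bar2):
    """Length shared by two (lower, upper) bars; zero or negative means no overlap."""
    return min(bar1[1], bar2[1]) - max(bar1[0], bar2[0])

bar_a = sd_bar(a_colubris)
bar_c = sd_bar(c_latirostris)
shared = overlap_length(bar_a, bar_c)

print(f"A. colubris bar:    {bar_a[0]:.2f} to {bar_a[1]:.2f} mm")
print(f"C. latirostris bar: {bar_c[0]:.2f} to {bar_c[1]:.2f} mm")
print("overlap" if shared > 0 else "no overlap", f"({shared:.2f} mm shared)")
```

For these data the bars share only a sliver (well under 0.1 mm), so the overlap alone is only a clue and cannot settle whether the difference is significant.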

[Graph 1, redrawn with sample sizes: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris 15.9 mm (n=10); C. latirostris 18.8 mm (n=10). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0)

“There is no significant difference.”

This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant.
  • If we accept H0, then the means are not significantly different.
  • If we reject H0, then the means are significantly different.

Remember:
  • We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.
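The slides do not show how t itself is calculated, so the following is only a sketch under a stated assumption: an unpaired test with equal variances (the pooled-variance, or Student's, form). The function name `pooled_t` is made up for this example, and the data are the Table 1 bill lengths read as 13.0, 14.0, ... mm.

```python
from math import sqrt
from statistics import mean, variance

# Bill lengths in mm, transcribed from Table 1 (n = 10 per species).
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def pooled_t(sample1, sample2):
    """Unpaired two-sample t statistic, assuming equal variances. Returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2  # total sample size minus two
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / se, df

t, df = pooled_t(c_latirostris, a_colubris)
print(f"t = {t:.2f}, df = {df}")
```

The bigger the difference between the means relative to the spread within each sample, the bigger t becomes, and the less plausible H0 looks.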

Example two-tailed t-table (“critical values”)

  P value        0.1     0.05    0.02    0.01
  confidence     90%     95%     98%     99%

  degrees of freedom
   1             6.31   12.71   31.82   63.66
   2             2.92    4.30    6.96    9.92
   3             2.35    3.18    4.54    5.84
   4             2.13    2.78    3.75    4.60
   5             2.02    2.57    3.37    4.03
   6             1.94    2.45    3.14    3.71
   7             1.89    2.36    3.00    3.50
   8             1.86    2.31    2.90    3.36
   9             1.83    2.26    2.82    3.25
  10             1.81    2.23    2.76    3.17

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

“Degrees of Freedom (df)” is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation: we are working in two directions – within each population and across populations.
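Reading the t-table is a two-step lookup: work out df, then pick the column for the chosen P value. Here is a small sketch with the example table typed in as a dictionary; the names `T_TABLE`, `degrees_of_freedom` and `critical_value` are invented for illustration, and only df 1 to 10 are covered.

```python
# Critical values transcribed from the example two-tailed t-table (df 1 to 10).
P_LEVELS = (0.1, 0.05, 0.02, 0.01)
T_TABLE = {
    1: (6.31, 12.71, 31.82, 63.66),
    2: (2.92, 4.30, 6.96, 9.92),
    3: (2.35, 3.18, 4.54, 5.84),
    4: (2.13, 2.78, 3.75, 4.60),
    5: (2.02, 2.57, 3.37, 4.03),
    6: (1.94, 2.45, 3.14, 3.71),
    7: (1.89, 2.36, 3.00, 3.50),
    8: (1.86, 2.31, 2.90, 3.36),
    9: (1.83, 2.26, 2.82, 3.25),
    10: (1.81, 2.23, 2.76, 3.17),
}

def degrees_of_freedom(n1, n2):
    """df for comparing two samples: total sample size minus two."""
    return n1 + n2 - 2

def critical_value(df, p=0.05):
    """Look up the two-tailed critical value for a given df and P value."""
    return T_TABLE[df][P_LEVELS.index(p)]

# Hypothetical example: two samples of five measurements each.
df = degrees_of_freedom(5, 5)
print(df, critical_value(df), critical_value(df, p=0.01))
```

Moving right along any row, P gets smaller, confidence gets higher and the critical value gets larger: a bigger t is needed before we can reject H0.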

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

At P = 0.05, the critical value (cv) is 2.069.

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: “There is a significant difference in the wing spans of the two populations of birds.”
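The decision rule can be written as a tiny function. This is only a sketch: `t_test_decision` is a made-up name, the absolute value of t is used because the sign of t depends on which sample is listed first, and treating t exactly equal to cv as not significant is an assumption (the slide does not cover that case).

```python
def t_test_decision(t, critical_value):
    """Apply the rule: reject H0 only when |t| is greater than the critical value."""
    if abs(t) > critical_value:
        return "reject H0: there is a significant difference"
    return "accept H0: there is no significant difference"

# Values from the wing span example: t = 2.15, cv = 2.069 at P = 0.05.
print(t_test_decision(2.15, 2.069))
```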

Critical value: 2.045

“There is no significant difference in the size of shells between north-side and south-side snail populations.”

Critical value: 2.086

“There is a significant difference in the resting heart rates between the two groups of swimmers.”

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability that the difference is due to chance), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B.)
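For readers curious about what a two-tailed, unpaired P value involves, here is a standard-library Python sketch. It makes two assumptions that are not stated in the slides: the equal-variance (pooled) form of t, and a numerical approach that adds up the area under the t distribution using Simpson's rule. The function names `t_pdf` and `two_tailed_p` are invented for this example, and the data are the Table 1 bill lengths read as 13.0, 14.0, ... mm.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

# Bill lengths in mm, transcribed from Table 1 (n = 10 per species).
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Area in both tails beyond |t|. 'steps' must be even (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3  # area between 0 and |t|
    return 1 - 2 * area_zero_to_t   # what is left over in the two tails

# Pooled-variance t statistic for the two unpaired samples.
n1, n2 = len(a_colubris), len(c_latirostris)
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * variance(a_colubris) + (n2 - 1) * variance(c_latirostris)) / df
t = (mean(c_latirostris) - mean(a_colubris)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

p = two_tailed_p(t, df)
print(f"t = {t:.2f}, df = {df}, P = {p:.5f}")
```

If the assumptions hold, the result should land close to the P value reported on the slide, far below the usual 0.05 cut-off.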

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
  • Where there is overlap, there is not a significant difference
  • Where there is no overlap, there is a significant difference
  • If the overlap (or difference) is small, a t-test should still be carried out

[Graph annotation: no overlap]

=CONFIDENCE.NORM(0.05, stdev, samplesize)
e.g. =CONFIDENCE.NORM(0.05, C15, 10)
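The same 95% confidence interval error bars can be sketched in Python. The function below is a stand-in that takes the same three inputs as the Excel formula (alpha, standard deviation, sample size) and returns the half-width of the bar based on the normal distribution; the name `confidence_norm` and the overlap check are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Bill lengths in mm, transcribed from Table 1 (n = 10 per species).
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def confidence_norm(alpha, standard_dev, size):
    """Half-width of a normal-based confidence interval around a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * standard_dev / sqrt(size)

def ci_bar(values, alpha=0.05):
    """Return the (lower, upper) ends of the confidence interval error bar."""
    half_width = confidence_norm(alpha, stdev(values), len(values))
    m = mean(values)
    return m - half_width, m + half_width

ci_a = ci_bar(a_colubris)
ci_c = ci_bar(c_latirostris)
print(f"A. colubris 95% CI:    {ci_a[0]:.2f} to {ci_a[1]:.2f} mm")
print(f"C. latirostris 95% CI: {ci_c[0]:.2f} to {ci_c[1]:.2f} mm")
print("overlap" if ci_a[1] >= ci_c[0] else "no overlap")
```

Because the interval divides s by the square root of n, these bars are narrower than the standard deviation bars, which is why they separate here even though the standard deviation bars just touch.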

Error bars can have very different purposes.

Standard deviation
  • You really need to know this
  • Look for relative size of bars
  • Used to indicate spread of most of the data around the mean
  • Can imply reliability of data

95% Confidence Intervals
  • Adds value to labs where we are looking for differences
  • Look for overlap, not size
  • Overlap: no significant difference
  • No overlap: significant difference

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21 in each group)

Is there a significant difference in the actual learning?

Evaluate the study:
  1. What do the error bars (standard deviation) tell us about reliability?
  2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology 2008

http://www.youtube.com/watch?v=fJEZg4QN760

Two-tailed t-table (critical values)

  P value        0.1     0.05    0.02    0.01    0.005
  confidence     90%     95%     98%     99%     99.5%

  degrees of freedom
   1             6.31   12.71   31.82   63.66   127.34
   2             2.92    4.30    6.96    9.92    14.09
   3             2.35    3.18    4.54    5.84     7.45
   4             2.13    2.78    3.75    4.60     5.60
   5             2.02    2.57    3.37    4.03     4.77
   6             1.94    2.45    3.14    3.71     4.32
   7             1.89    2.36    3.00    3.50     4.03
   8             1.86    2.31    2.90    3.36     3.83
   9             1.83    2.26    2.82    3.25     3.69
  10             1.81    2.23    2.76    3.17     3.58
  11             1.80    2.20    2.72    3.11     3.50
  12             1.78    2.18    2.68    3.05     3.43
  13             1.77    2.16    2.65    3.01     3.37
  14             1.76    2.14    2.62    2.98     3.33
  15             1.75    2.13    2.60    2.95     3.29
  16             1.75    2.12    2.58    2.92     3.25
  17             1.74    2.11    2.57    2.90     3.22
  18             1.73    2.10    2.55    2.88     3.20
  19             1.73    2.09    2.54    2.86     3.17
  20             1.72    2.09    2.53    2.85     3.15
  21             1.72    2.08    2.52    2.83     3.14
  22             1.72    2.07    2.51    2.82     3.12
  23             1.71    2.07    2.50    2.81     3.10
  24             1.71    2.06    2.49    2.80     3.09
  25             1.71    2.06    2.49    2.79     3.08
  26             1.71    2.06    2.48    2.78     3.07
  27             1.70    2.05    2.47    2.77     3.06
  28             1.70    2.05    2.47    2.76     3.05
  29             1.70    2.05    2.46    2.76     3.04
  30             1.70    2.04    2.46    2.75     3.03
  31             1.70    2.04    2.45    2.74     3.02
  32             1.69    2.04    2.45    2.74     3.02
  33             1.69    2.03    2.44    2.73     3.01
  34             1.69    2.03    2.44    2.73     3.00
  35             1.69    2.03    2.44    2.72     3.00
  36             1.69    2.03    2.43    2.72     2.99
  37             1.69    2.03    2.43    2.72     2.99
  38             1.69    2.02    2.43    2.71     2.98
  39             1.68    2.02    2.43    2.71     2.98
  40             1.68    2.02    2.42    2.70     2.97

Cartoon from http://www.xkcd.com/552

“Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.”

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine)

See: What is factual about the graph?
  • What are the axes?
  • What is being plotted?
  • What values are present?

Think: How is the graph interpreted?
  • What relationship is present?
  • Is cause implied?
  • What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
  • What do you need to know more about?

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.
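A confounding variable can be imitated with made-up numbers. In this sketch (all values and names are hypothetical), `variable_a` and `variable_b` are each built from the same hidden `confounder` plus a fixed wiggle; neither appears in the formula for the other, yet they still correlate strongly. The Pearson correlation coefficient r is calculated by hand so the steps are visible.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a hidden condition that both measured variables share.
confounder = [10, 12, 15, 18, 20, 23, 25, 28, 30, 33]
# Fixed 'noise' values so that the example gives the same answer every time.
wiggle = [1, -1, 2, -2, 0, 1, -1, 2, -2, 0]

# Each variable depends on the confounder, but not on the other variable.
variable_a = [5 * c + 3 * w for c, w in zip(confounder, wiggle)]
variable_b = [2 * c - 4 * w for c, w in zip(confounder, wiggle)]

r = pearson_r(variable_a, variable_b)
print(f"r = {r:.2f}")
```

A strong r here says nothing about A causing B. Only an experiment that changes one variable while controlling the others could show that.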

To be able to determine causality through experimentation we need:
  • One clearly identified independent variable
  • Carefully measured dependent variable(s) that can be attributed to change in the independent variable
  • Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
  • Atmospheric CO2 concentrations and global warming
  • Atmospheric CO2 concentrations and the rate of photosynthesis
  • Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

@IBiologyStephen

Page 17: Statistical Analysis

httpmathbitscomMathBitsTINSectionStatistics1Spreadsheethtml

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

C latirostris

A colubris

Standard deviation is a measure of the spread of most of the data Error bars are a graphical representation of the variability of data

Which of the two sets of data has

a The highest mean

b The greatest variability in the data

A

B

Error bars could represent standard deviation range or confidence intervals

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars This is very important

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap No overlap

Lots of shared data points within each data set

Results are not likely to be significantly different from each other

Any difference is most likely due to chance

No (or very few) shared data points within each data set

Results are more likely to be significantly different from each other

The difference is more likely to be lsquorealrsquo

-30

20

70

120

170

220

A colubris 159mm(n=10)

C latirostris 188mm(n=10)

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris(error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Our results show a very small overlap between the two sets of data

So how do we know if the difference is significant or not

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0)

ldquoThere is no significant differencerdquo

This is the lsquodefaultrsquo hypothesis that we always testIn our conclusion we either accept the null hypothesis or reject it

A t-test can be used to test whether the difference between two means is significant bull If we accept H0 then the means are not significantly different bull If we reject H0 then the means are significantly different

Rememberbull We are never lsquotryingrsquo to get a difference We design carefully-controlled experiments and

then analyse the results using statistical analysis

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

What happens to the value of P as the confidence in the results increases

What happens to the critical value as the confidence level increases

ldquocritical valuesrdquo

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation: we are working in two directions – within each population and across populations.

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

At P = 0.05, the critical value (cv) read from the table is 2.069.

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: “There is a significant difference in the wing spans of the two populations of birds.”
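The decision rule can be sketched in a few lines of Python. The lookup table holds only the three critical values quoted in the worked examples of this presentation; the degrees of freedom attached to them (20, 23 and 29) are not stated in the transcript and are assumed here from a standard two-tailed t-table at P = 0.05. The function name `t_test_decision` is hypothetical.

```python
# Critical values at P = 0.05 (two-tailed). The df keys are assumptions taken
# from a standard t-table; the slides quote only the critical values.
CRITICAL_VALUES_P05 = {20: 2.086, 23: 2.069, 29: 2.045}


def t_test_decision(t, df, table=CRITICAL_VALUES_P05):
    """Compare |t| with the critical value for df and return the decision."""
    cv = table[df]
    if abs(t) > cv:
        return f"t = {abs(t)} > cv = {cv}: reject H0 (significant difference)"
    return f"t = {abs(t)} <= cv = {cv}: accept H0 (no significant difference)"


# Wing-span example: t was given as 2.15; df = 23 is assumed.
print(t_test_decision(2.15, 23))
```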

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

Critical value: 2.045

Conclusion: “There is no significant difference in the size of shells between north-side and south-side snail populations.”

Critical value: 2.086

Conclusion: “There is a significant difference in the resting heart rates between the two groups of swimmers.”

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability of getting a difference at least this large by chance alone if H0 were true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005 (and so well below the usual 0.05 threshold), so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population.
(Type 1 pairs the results of each individual in set A with the same individual in set B.)
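For readers without Excel, a two-tailed P value can be approximated with the Python standard library alone. This sketch integrates the Student's t density numerically with Simpson's rule; it is not how Excel computes the value, and the inputs t = 4.22 and df = 18 are assumptions obtained by applying an equal-variance unpaired t-test to the Table 1 bill-length data.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Area in both tails beyond |t|, using Simpson's rule (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric about 0, so both tails together are 1 - 2 * area.
    return 1 - 2 * area_0_to_t


p = two_tailed_p(4.22, 18)
print(f"P = {p:.5f}")
print("reject H0" if p < 0.05 else "accept H0")
```

The result should land close to the P value reported by Excel above, and well below the 0.05 threshold.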

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference.
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.

(Graph annotation: no overlap)

=CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05,C15,10)
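A minimal Python sketch of the same calculation is given below, using the Table 1 bill-length data. Excel's CONFIDENCE.NORM returns the half-width z × s / √n of a normal-based interval; the helper names `confidence_margin` and `interval_95` are made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Bill lengths (mm) from Table 1 of this presentation.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def confidence_margin(alpha, sd, n):
    """Half-width of a normal-based confidence interval: z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * sd / sqrt(n)


def interval_95(sample):
    """Return (lower, upper) limits of the 95% confidence interval error bar."""
    margin = confidence_margin(0.05, stdev(sample), len(sample))
    return mean(sample) - margin, mean(sample) + margin


low_a, high_a = interval_95(a_colubris)
low_c, high_c = interval_95(c_latirostris)
overlap = high_a >= low_c and high_c >= low_a

print(f"A. colubris:    {low_a:.2f} to {high_a:.2f} mm")
print(f"C. latirostris: {low_c:.2f} to {high_c:.2f} mm")
print("Error bars overlap" if overlap else "No overlap")
```

For these data the two intervals do not overlap, which agrees with the result of the t-test.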

Error bars can have very different purposes.

Standard deviation
• You really need to know this.
• Look for relative size of bars.
• Used to indicate spread of most of the data around the mean.
• Can imply reliability of data.

95% Confidence Intervals
• Adds value to labs where we are looking for differences.
• Look for overlap, not size.
• Overlap: no significant difference.
• No overlap: significant difference.

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n = 21 in each group.)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

P value               0.1     0.05    0.02    0.01    0.005
Confidence            90%     95%     98%     99%     99.50%

Degrees of freedom
 1                    6.31    12.71   31.82   63.66   127.34
 2                    2.92    4.30    6.96    9.92    14.09
 3                    2.35    3.18    4.54    5.84    7.45
 4                    2.13    2.78    3.75    4.60    5.60
 5                    2.02    2.57    3.37    4.03    4.77
 6                    1.94    2.45    3.14    3.71    4.32
 7                    1.89    2.36    3.00    3.50    4.03
 8                    1.86    2.31    2.90    3.36    3.83
 9                    1.83    2.26    2.82    3.25    3.69
10                    1.81    2.23    2.76    3.17    3.58
11                    1.80    2.20    2.72    3.11    3.50
12                    1.78    2.18    2.68    3.05    3.43
13                    1.77    2.16    2.65    3.01    3.37
14                    1.76    2.14    2.62    2.98    3.33
15                    1.75    2.13    2.60    2.95    3.29
16                    1.75    2.12    2.58    2.92    3.25
17                    1.74    2.11    2.57    2.90    3.22
18                    1.73    2.10    2.55    2.88    3.20
19                    1.73    2.09    2.54    2.86    3.17
20                    1.72    2.09    2.53    2.85    3.15
21                    1.72    2.08    2.52    2.83    3.14
22                    1.72    2.07    2.51    2.82    3.12
23                    1.71    2.07    2.50    2.81    3.10
24                    1.71    2.06    2.49    2.80    3.09
25                    1.71    2.06    2.49    2.79    3.08
26                    1.71    2.06    2.48    2.78    3.07
27                    1.70    2.05    2.47    2.77    3.06
28                    1.70    2.05    2.47    2.76    3.05
29                    1.70    2.05    2.46    2.76    3.04
30                    1.70    2.04    2.46    2.75    3.03
31                    1.70    2.04    2.45    2.74    3.02
32                    1.69    2.04    2.45    2.74    3.02
33                    1.69    2.03    2.44    2.73    3.01
34                    1.69    2.03    2.44    2.73    3.00
35                    1.69    2.03    2.44    2.72    3.00
36                    1.69    2.03    2.43    2.72    2.99
37                    1.69    2.03    2.43    2.72    2.99
38                    1.69    2.02    2.43    2.71    2.98
39                    1.68    2.02    2.43    2.71    2.98
40                    1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552/

“Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.”

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.

To be able to determine causality through experimentation, we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen


Standard deviation is a measure of the spread of most of the data.

Table 1: Raw measurements of bill length in A. colubris and C. latirostris

        Bill length (±0.1 mm)
n       A. colubris    C. latirostris
1       13.0           17.0
2       14.0           18.0
3       15.0           18.0
4       15.0           18.0
5       15.0           19.0
6       16.0           19.0
7       16.0           19.0
8       18.0           20.0
9       18.0           20.0
10      19.0           20.0
Mean    15.9           18.8
s       1.91           1.03

Standard deviation can have one more decimal place. =STDEV (highlight RAW data)

Which of the two sets of data has:

a. The longest mean bill length?

b. The greatest variability in the data?


Answers: a. C. latirostris (longest mean bill length); b. A. colubris (greatest variability).
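The same mean and standard deviation can be checked outside Excel. This is a small sketch using Python's standard library, where `statistics.stdev` is the sample standard deviation (n − 1 in the denominator), the same quantity Excel's =STDEV returns. The dictionary name `samples` is just an illustrative choice.

```python
from statistics import mean, stdev

# Bill lengths (mm) from Table 1.
samples = {
    "A. colubris": [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0],
    "C. latirostris": [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0],
}

for species, lengths in samples.items():
    # Mean to the precision of the raw data; s with one more decimal place.
    print(f"{species}: mean = {mean(lengths):.1f} mm, s = {stdev(lengths):.2f} mm")
```

This prints a mean of 15.9 mm with s = 1.91 mm for A. colubris and a mean of 18.8 mm with s = 1.03 mm for C. latirostris, matching the table.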

Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data has:

a. The highest mean?

b. The greatest variability in the data?

A

B

Error bars could represent standard deviation, range or confidence intervals.

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Bars: A. colubris, 15.9 mm; C. latirostris, 18.8 mm. x-axis: species of hummingbird; y-axis: mean bill length (±0.1 mm).

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars.

Variability has been visualised.

The error bars overlap somewhat.

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’

Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris (15.9 mm, n = 10) and C. latirostris (18.8 mm, n = 10); error bars = standard deviation. x-axis: species of hummingbird; y-axis: mean bill length (±0.1 mm).

Our results show a very small overlap between the two sets of data.
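How small the overlap is can be checked numerically. The sketch below, using the Table 1 measurements, treats each error bar as mean ± one sample standard deviation and measures how far the two bars overlap; the helper name `sd_bar` is hypothetical.

```python
from statistics import mean, stdev

# Bill lengths (mm) from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def sd_bar(sample):
    """Return the (lower, upper) ends of a mean +/- 1 standard deviation bar."""
    m, s = mean(sample), stdev(sample)
    return m - s, m + s


low_a, high_a = sd_bar(a_colubris)
low_c, high_c = sd_bar(c_latirostris)

# A positive value is the length of the shared region; negative means a gap.
overlap = min(high_a, high_c) - max(low_a, low_c)

print(f"A. colubris bar:    {low_a:.2f} to {high_a:.2f} mm")
print(f"C. latirostris bar: {low_c:.2f} to {high_c:.2f} mm")
print(f"Overlap: {overlap:.2f} mm")
```

The bars share only a few hundredths of a millimetre, so the graph alone cannot settle whether the difference is significant.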



Standard deviation is a measure of the spread of most of the data

 

Table 1 Raw measurements of bill length in A colubris and C latirostris     Bill length (plusmn01mm)   n A colubris C latirostris

  1 130 170

  2 140 180

  3 150 180

  4 150 180

  5 150 190

  6 160 190

  7 160 190

  8 180 200

  9 180 200

  10 190 200

 Mean 159 188   s 191 103        

Standard deviation can have one more decimal place =STDEV (highlight RAW data)

Which of the two sets of data has

a The longest mean bill length

b The greatest variability in the data

C latirostris

A colubris

Standard deviation is a measure of the spread of most of the data Error bars are a graphical representation of the variability of data

Which of the two sets of data has

a The highest mean

b The greatest variability in the data

A

B

Error bars could represent standard deviation range or confidence intervals

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Put the error bars for standard deviation on our graph

Delete the horizontal error bars

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars This is very important

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap No overlap

Lots of shared data points within each data set

Results are not likely to be significantly different from each other

Any difference is most likely due to chance

No (or very few) shared data points within each data set

Results are more likely to be significantly different from each other

The difference is more likely to be lsquorealrsquo

-30

20

70

120

170

220

A colubris 159mm(n=10)

C latirostris 188mm(n=10)

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris(error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Our results show a very small overlap between the two sets of data

So how do we know if the difference is significant or not

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0)

ldquoThere is no significant differencerdquo

This is the lsquodefaultrsquo hypothesis that we always testIn our conclusion we either accept the null hypothesis or reject it

A t-test can be used to test whether the difference between two means is significant bull If we accept H0 then the means are not significantly different bull If we reject H0 then the means are significantly different

Rememberbull We are never lsquotryingrsquo to get a difference We design carefully-controlled experiments and

then analyse the results using statistical analysis

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

What happens to the value of P as the confidence in the results increases

What happens to the critical value as the confidence level increases

ldquocritical valuesrdquo

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

We usually use Plt005 (95 confidence) in Biology as our data can be highly variable

Simple explanation we are working in two directions ndash within each population and across populations

ldquocritical valuesrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

005

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

Conclusion ldquoThere is a significant difference in the wing spans of the two populations of birdsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20452045

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is no significant difference in the size of shells between north-side and south-side snail populationsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20862086

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is a significant difference in the resting heart rates between the two groups of swimmersrdquo

Excel can jump straight to a value of P for our resultsOne function (=ttest) compares both sets of data

As it calculates P directly (the probability that the difference is due to chance) we can determine significance directly

In this case P=000051

This is much smaller than 0005 so we are confident that we can

reject H0

The difference is unlikely to be due to chance

Conclusion There is a significant difference in bill length between A colubris and C latirostris

Two tails we assume data are normally distributed with two lsquotailsrsquo moving away from mean Type 2 (unpaired) we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)

95 Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a resultbull Where there is overlap there is not a significant differencebull Where there is no overlap there is a significant difference bull If the overlap (or difference) is small a t-test should still be carried out

no overlap

=CONFIDENCENORM(005stdevsamplesize)eg =CONFIDENCENORM(005C1510)

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no significant difference
• No overlap: significant difference

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21, n=21)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
confidence    90%     95%     98%     99%     99.5%

degrees of freedom
 1            6.31   12.71   31.82   63.66  127.34
 2            2.92    4.30    6.96    9.92   14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing “look over there”.

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.
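A confounding variable can be sketched with simulated data. In the Python below, everything is made up for illustration: two variables (var_a and var_b) never affect each other, but both depend on a shared third factor (confounder). The names, the sample size of 1000 and the noise levels are all assumptions, and no real data are used.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

random.seed(1)  # fixed seed so the simulation is repeatable

# The shared condition that both variables depend on
confounder = [random.gauss(0, 1) for _ in range(1000)]
# Neither variable is calculated from the other, only from the confounder
var_a = [c + random.gauss(0, 0.5) for c in confounder]
var_b = [c + random.gauss(0, 0.5) for c in confounder]

r_raw = pearson_r(var_a, var_b)

# Remove the confounder's contribution, then correlate what is left
rest_a = [a - c for a, c in zip(var_a, confounder)]
rest_b = [b - c for b, c in zip(var_b, confounder)]
r_controlled = pearson_r(rest_a, rest_b)

print(f"r before controlling for the confounder: {r_raw:.2f}")
print(f"r after controlling for the confounder:  {r_controlled:.2f}")
```

The strong correlation in the raw data comes entirely from the shared factor, which is why controlling other variables matters in experimental design.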

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

IBiologyStephen


Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data.

Which of the two sets of data (A or B) has:

a. The highest mean?

b. The greatest variability in the data?

Error bars could represent standard deviation, range or confidence intervals.
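The values behind error bars can also be calculated outside Excel. A minimal Python sketch follows, using the standard library only. The function name (summarise) and the five bill lengths are hypothetical, chosen purely to illustrate the calculation.

```python
import statistics

def summarise(values):
    """Return the mean, sample standard deviation and range of a data set."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (divides by n - 1), like Excel's =STDEV
    data_range = max(values) - min(values)
    return mean, sd, data_range

# Hypothetical bill lengths in mm (illustrative values, not from the slides)
sample = [15.0, 16.0, 17.0, 16.0, 16.0]
mean, sd, data_range = summarise(sample)
print(f"mean = {mean:.1f} mm, SD = {sd:.2f} mm, range = {data_range:.1f} mm")
```

Either the standard deviation or the range could then be plotted as error bars around the mean, as long as the graph title says which was used.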

Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0. Means: A. colubris 15.9 mm; C. latirostris 18.8 mm.]

Title is adjusted to show the source of the error bars. This is very important!

You can see the clear difference in the size of the error bars.

Variability has been visualised.

The error bars overlap somewhat.

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm), -3.0 to 22.0. Means: A. colubris 15.9 mm (n=10); C. latirostris 18.8 mm (n=10).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0):

“There is no significant difference.”

This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant.
• If we accept H0, then the means are not significantly different
• If we reject H0, then the means are significantly different

Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table (“critical values”)

P value       0.1     0.05    0.02    0.01
confidence    90%     95%     98%     99%

degrees of freedom
 1            6.31   12.71   31.82   63.66
 2            2.92    4.30    6.96    9.92
 3            2.35    3.18    4.54    5.84
 4            2.13    2.78    3.75    4.60
 5            2.02    2.57    3.37    4.03
 6            1.94    2.45    3.14    3.71
 7            1.89    2.36    3.00    3.50
 8            1.86    2.31    2.90    3.36
 9            1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

“Degrees of Freedom (df)” is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation: we are working in two directions – within each population and across populations.
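In these slides the value of t is "done for you", but the calculation can be sketched in a few lines of Python. This is a minimal sketch, assuming an unpaired test with equal variances and df = total sample size minus two, as stated above. The function name (unpaired_t) and both data sets are hypothetical.

```python
import math
import statistics

def unpaired_t(sample_a, sample_b):
    """Return (t, df) for an unpaired, equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # total sample size minus two
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    # Pool the two sample variances, weighted by their own degrees of freedom
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    difference = abs(statistics.mean(sample_a) - statistics.mean(sample_b))
    return difference / standard_error, df

# Hypothetical measurements (illustrative values only)
t, df = unpaired_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(f"t = {t:.2f}, df = {df}")
```

For these made-up data, t = 2.00 with df = 8. The critical value for df = 8 at P = 0.05 is 2.31, so t is less than the critical value and H0 would be accepted.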

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

P = 0.05, critical value (cv) = 2.069

t vs cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: “There is a significant difference in the wing spans of the two populations of birds.”
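The decision rule can be written as a short Python sketch. The function name (decide) and the dictionary name are made up for this example. The critical values are the P = 0.05 column of the example t-table (df 1 to 10), and 2.069 is the critical value used in the wing span example. The slides do not say what to do when t equals the critical value exactly; this sketch assumes H0 is accepted in that case.

```python
# P = 0.05 (two-tailed) critical values for df 1 to 10, from the example t-table
CRITICAL_VALUES_P05 = {
    1: 12.71, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57,
    6: 2.45, 7: 2.36, 8: 2.31, 9: 2.26, 10: 2.23,
}

def decide(t, cv):
    """Compare a calculated t with the critical value (cv) and decide on H0."""
    if t > cv:
        return "reject H0"  # there is a significant difference
    return "accept H0"      # there is no significant difference

# Wing span example from the slides: t = 2.15, cv = 2.069
print(decide(2.15, 2.069))
# A hypothetical result of t = 2.00 with df = 8
print(decide(2.00, CRITICAL_VALUES_P05[8]))
```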





Put the error bars for standard deviation on our graph.

Delete the horizontal error bars.

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris (15.9 mm) and C. latirostris (18.8 mm). Error bars = standard deviation. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm).]

Title is adjusted to show the source of the error bars. This is very important.

You can see the clear difference in the size of the error bars.

Variability has been visualised.

The error bars overlap somewhat.

What does this mean?

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be 'real'
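As a rough illustration of reading overlap, the sketch below computes mean ± one standard deviation for two groups and checks whether the two ranges share any values. The measurements are invented for illustration (they are not the hummingbird data), and the helper names sd_interval and intervals_overlap are hypothetical.

```python
from statistics import mean, stdev


def sd_interval(values):
    """Return (mean - SD, mean + SD) using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return (m - s, m + s)


def intervals_overlap(a, b):
    """Return True if two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical measurements in mm, invented for illustration only
group_a = [10.0, 12.0, 14.0]
group_b = [13.0, 15.0, 17.0]

bars_a = sd_interval(group_a)
bars_b = sd_interval(group_b)
print("Group A error bar range:", bars_a)
print("Group B error bar range:", bars_b)
print("Error bars overlap:", intervals_overlap(bars_a, bars_b))
```

An overlap found this way is only a clue; a statistical test is still needed to judge significance.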

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris (15.9 mm, n=10) and C. latirostris (18.8 mm, n=10). Error bars = standard deviation. x-axis: Species of hummingbird. y-axis: Mean bill length (±0.1 mm).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0)

"There is no significant difference."

This is the 'default' hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.

Remember:
• We are never 'trying' to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table ("critical values")

P value       0.1     0.05    0.02    0.01
Confidence    90%     95%     98%     99%
Degrees of freedom
 1            6.31    12.71   31.82   63.66
 2            2.92    4.30    6.96    9.92
 3            2.35    3.18    4.54    5.84
 4            2.13    2.78    3.75    4.60
 5            2.02    2.57    3.37    4.03
 6            1.94    2.45    3.14    3.71
 7            1.89    2.36    3.00    3.50
 8            1.86    2.31    2.90    3.36
 9            1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of 't' for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

"Degrees of Freedom (df)" is the total sample size minus two.

Simple explanation: we are working in two directions, within each population and across populations.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?

We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable.
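The presentation does not show how 't' itself is calculated. As a minimal sketch, assuming the equal-variance (pooled) form of the unpaired two-sample t-test, it could be computed as below. The function name pooled_t and the sample data are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an unpaired two-sample t-test.

    Assumption: both groups share a common variance (pooled estimate).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # total sample size minus two
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = abs(mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


# Hypothetical data, invented for illustration only
group_a = [10.0, 12.0, 14.0]
group_b = [13.0, 15.0, 17.0]

t_value, degrees_of_freedom = pooled_t(group_a, group_b)
print("t =", round(t_value, 2), "df =", degrees_of_freedom)
```

The resulting t is then compared with the critical value in the row of the t-table that matches df.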

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

t was calculated as 2.15 (this is done for you).

At P = 0.05, the critical value (cv) is 2.069.

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
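The decision rule above can be sketched in a few lines. The critical values are the two-tailed P = 0.05 entries from the t-table in this presentation (only a few rows are copied here); the names CRITICAL_T_005 and decide are hypothetical. The slides do not say what to do when t equals cv exactly, so treating that case as "accept H0" is an assumption.

```python
# Two-tailed critical values at P = 0.05, keyed by degrees of freedom.
# Copied from the t-table in this presentation (subset of rows only).
CRITICAL_T_005 = {
    4: 2.78,
    10: 2.23,
    20: 2.09,
    23: 2.07,
    29: 2.05,
}


def decide(t, df, table=CRITICAL_T_005):
    """Compare a calculated t with the critical value for df."""
    cv = table[df]
    if abs(t) > cv:
        return "reject H0"  # there is a significant difference
    # Assumption: t equal to cv is treated as not significant
    return "accept H0"  # there is no significant difference


print(decide(2.15, 23))  # t from the worked example above
```

Looking the critical value up by df keeps the comparison tied to the sample size, which is the point of the table.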

cv = 2.045

"There is no significant difference in the size of shells between north-side and south-side snail populations."

cv = 2.086

"There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=ttest) compares both sets of data.

As it calculates P directly (loosely, the probability that a difference this large would arise by chance if H0 were true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two 'tails' moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population.
(Type 1 pairs the results of each individual in set A with the same individual in set B.)

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference.
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.

(Graph label: no overlap)

=CONFIDENCE.NORM(0.05,stdev,samplesize)
e.g. =CONFIDENCE.NORM(0.05,C15,10)
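For readers working outside Excel, the sketch below mirrors what CONFIDENCE.NORM returns: the half-width of a confidence interval based on the normal distribution, which is added to and subtracted from the mean to draw the error bar. The function name confidence_norm and the example standard deviation are hypothetical, not values from this presentation.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha, stdev, size):
    """Half-width of a (1 - alpha) confidence interval for a mean,
    using the normal distribution, as Excel's CONFIDENCE.NORM does."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * stdev / sqrt(size)


# Hypothetical example: standard deviation of 1.2 mm from 10 measurements
half_width = confidence_norm(0.05, 1.2, 10)
print("95% CI error bar = mean ±", round(half_width, 2), "mm")
```

Because the half-width shrinks as the sample size grows, confidence interval bars respond to n in a way that standard deviation bars do not.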

Error bars can have very different purposes.

Standard deviation:
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals:
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no sig. diff.
• No overlap: sig. diff.

Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

[Graph: error bars = standard deviation; n=21 in each group]

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
Confidence    90%     95%     98%     99%     99.50%
Degrees of freedom
 1            6.31    12.71   31.82   63.66   127.34
 2            2.92    4.30    6.96    9.92    14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552

"Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From MrT's Excel Statbook

Interpreting Graphs: See - Think - Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See - Think - Wonder: Visible Thinking Routine

Graph source: http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common but that do not directly affect each other.
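To make the idea of a confounding variable concrete, here is a small sketch with invented numbers: a hidden variable drives two measured variables, which then correlate perfectly even though neither affects the other. The function pearson_r is a plain implementation of the Pearson correlation coefficient; all variable names and values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical confounder: 'hidden' influences both measured variables
hidden = [1, 2, 3, 4, 5]
measured_x = [2 * h + 1 for h in hidden]  # depends only on hidden
measured_y = [3 * h + 4 for h in hidden]  # depends only on hidden

print("r =", round(pearson_r(measured_x, measured_y), 3))
```

A strong r here says nothing about x causing y; only a controlled experiment that varies one variable while holding the others constant can test that.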

To be able to determine causality through experimentation, we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

@IBiologyStephen

Page 24: Statistical Analysis

00

50

100

150

200

A colubris 159mm

C latirostris 188mm

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris (error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Title is adjusted to show the source of the error bars This is very important

You can see the clear difference in the size of the error bars

Variability has been visualised

The error bars overlap somewhat

What does this mean

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data

Large overlap No overlap

Lots of shared data points within each data set

Results are not likely to be significantly different from each other

Any difference is most likely due to chance

No (or very few) shared data points within each data set

Results are more likely to be significantly different from each other

The difference is more likely to be lsquorealrsquo

-30

20

70

120

170

220

A colubris 159mm(n=10)

C latirostris 188mm(n=10)

Graph 1 Comparing mean bill lengths in two hummingbird species A colubris and C

latirostris(error bars = standard deviation)

Species of hummingbird

Mea

n Bi

ll le

ngth

(plusmn0

1m

m)

Our results show a very small overlap between the two sets of data

So how do we know if the difference is significant or not

We need to use a statistical test

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data

The Null Hypothesis (H0)

ldquoThere is no significant differencerdquo

This is the lsquodefaultrsquo hypothesis that we always testIn our conclusion we either accept the null hypothesis or reject it

A t-test can be used to test whether the difference between two means is significant bull If we accept H0 then the means are not significantly different bull If we reject H0 then the means are significantly different

Rememberbull We are never lsquotryingrsquo to get a difference We design carefully-controlled experiments and

then analyse the results using statistical analysis

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

What happens to the value of P as the confidence in the results increases

What happens to the critical value as the confidence level increases

ldquocritical valuesrdquo

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

We usually use Plt005 (95 confidence) in Biology as our data can be highly variable

Simple explanation we are working in two directions ndash within each population and across populations

ldquocritical valuesrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

005

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

Conclusion ldquoThere is a significant difference in the wing spans of the two populations of birdsrdquo

Critical value: 2.045

“There is no significant difference in the size of shells between north-side and south-side snail populations.”

Critical value: 2.086

“There is a significant difference in the resting heart rates between the two groups of swimmers.”

Excel can jump straight to a value of P for our results. One function (=ttest) compares both sets of data.

As it calculates P directly (the probability that the difference is due to chance), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two ‘tails’ moving away from the mean.

Type 2 (unpaired): we are comparing one whole population with the other whole population.

(Type 1 pairs the results of each individual in set A with the same individual in set B.)
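For readers who want to see what sits behind an unpaired test, here is a minimal Python sketch of the t statistic for two independent samples, assuming equal variances (which, as far as I know, is what Excel's type 2 option does). It stops at t and the degrees of freedom and does not compute P. The function name `unpaired_t` and the bill lengths are hypothetical and are not the data set used in the slides.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Return (t, df) for a two-sample t-test that assumes equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # total sample size minus two
    # Pooled variance: the two sample variances weighted by their sizes
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


# Hypothetical bill lengths in mm, for illustration only
species_a = [15.0, 16.0, 17.0, 16.0]
species_b = [18.0, 19.0, 20.0, 19.0]
t, df = unpaired_t(species_a, species_b)
print(f"t = {t:.2f}, df = {df}")
```

The size of t (ignoring its sign) is then compared with the critical value for that df in a two-tailed t-table.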

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference
• Where there is no overlap, there is a significant difference
• If the overlap (or difference) is small, a t-test should still be carried out

(Graph annotation: no overlap)

=CONFIDENCE.NORM(0.05, stdev, samplesize), e.g. =CONFIDENCE.NORM(0.05, C15, 10)
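The Excel formula returns the half-width of the confidence interval, i.e. the length of the error bar above and below the mean. A rough Python equivalent, assuming the usual normal-distribution formula z × stdev / √n, might look like this. The function name `confidence_norm` and the input values are chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha: float, stdev: float, size: int) -> float:
    """Half-width of a normal-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * stdev / sqrt(size)


# Hypothetical inputs: standard deviation 1.5 mm, sample size 10
margin = confidence_norm(0.05, 1.5, 10)
print(f"95% CI = mean ± {margin:.2f} mm")
```

The error bar is then drawn from mean − margin to mean + margin.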

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no sig. diff.
• No overlap: sig. diff.

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21 for each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008.

http://www.youtube.com/watch?v=fJEZg4QN760

Two-tailed t-table (critical values)

P value      0.1    0.05    0.02    0.01    0.005
confidence   90%    95%     98%     99%     99.5%

df
 1           6.31   12.71   31.82   63.66   127.34
 2           2.92    4.30    6.96    9.92    14.09
 3           2.35    3.18    4.54    5.84     7.45
 4           2.13    2.78    3.75    4.60     5.60
 5           2.02    2.57    3.37    4.03     4.77
 6           1.94    2.45    3.14    3.71     4.32
 7           1.89    2.36    3.00    3.50     4.03
 8           1.86    2.31    2.90    3.36     3.83
 9           1.83    2.26    2.82    3.25     3.69
10           1.81    2.23    2.76    3.17     3.58
11           1.80    2.20    2.72    3.11     3.50
12           1.78    2.18    2.68    3.05     3.43
13           1.77    2.16    2.65    3.01     3.37
14           1.76    2.14    2.62    2.98     3.33
15           1.75    2.13    2.60    2.95     3.29
16           1.75    2.12    2.58    2.92     3.25
17           1.74    2.11    2.57    2.90     3.22
18           1.73    2.10    2.55    2.88     3.20
19           1.73    2.09    2.54    2.86     3.17
20           1.72    2.09    2.53    2.85     3.15
21           1.72    2.08    2.52    2.83     3.14
22           1.72    2.07    2.51    2.82     3.12
23           1.71    2.07    2.50    2.81     3.10
24           1.71    2.06    2.49    2.80     3.09
25           1.71    2.06    2.49    2.79     3.08
26           1.71    2.06    2.48    2.78     3.07
27           1.70    2.05    2.47    2.77     3.06
28           1.70    2.05    2.47    2.76     3.05
29           1.70    2.05    2.46    2.76     3.04
30           1.70    2.04    2.46    2.75     3.03
31           1.70    2.04    2.45    2.74     3.02
32           1.69    2.04    2.45    2.74     3.02
33           1.69    2.03    2.44    2.73     3.01
34           1.69    2.03    2.44    2.73     3.00
35           1.69    2.03    2.44    2.72     3.00
36           1.69    2.03    2.43    2.72     2.99
37           1.69    2.03    2.43    2.72     2.99
38           1.69    2.02    2.43    2.71     2.98
39           1.68    2.02    2.43    2.71     2.98
40           1.68    2.02    2.42    2.70     2.97

Cartoon from http://www.xkcd.com/552/

“Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.”

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine)

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common but that do not directly affect each other.

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity
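A small Python sketch can show how a confounding variable produces a strong correlation between two measurements that do not affect each other. All values here are invented for illustration; `pearson_r` is a hand-written helper implementing the standard Pearson correlation coefficient, not something taken from the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical confounding variable that drives both measurements
confounder = [1.0, 2.0, 3.0, 4.0, 5.0]
variable_a = [2 * c for c in confounder]      # depends only on the confounder
variable_b = [3 * c + 1 for c in confounder]  # depends only on the confounder

print(f"r = {pearson_r(variable_a, variable_b):.2f}")
```

Here the two variables are almost perfectly correlated even though neither was calculated from the other. Only a controlled experiment could separate their effects.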

Flamenco Dancer, by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

@IBiologyStephen

Page 25: Statistical Analysis

The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.

Large overlap:
• Lots of shared data points within each data set
• Results are not likely to be significantly different from each other
• Any difference is most likely due to chance

No overlap:
• No (or very few) shared data points within each data set
• Results are more likely to be significantly different from each other
• The difference is more likely to be ‘real’
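Checking for overlap is simple arithmetic on mean ± standard deviation, as this short Python sketch shows. The means are the ones reported for the hummingbird example (15.9 mm and 18.8 mm). The standard deviations are hypothetical placeholders, because the slides do not state them, and `error_bars_overlap` is a name invented for this example.

```python
def error_bars_overlap(mean_a, sd_a, mean_b, sd_b):
    """True when the ranges mean ± sd of the two samples share any values."""
    low_a, high_a = mean_a - sd_a, mean_a + sd_a
    low_b, high_b = mean_b - sd_b, mean_b + sd_b
    return low_a <= high_b and low_b <= high_a


# Means from the hummingbird example; standard deviations are hypothetical
print(error_bars_overlap(15.9, 1.9, 18.8, 1.5))  # upper bar 17.8 vs lower bar 17.3
```

Overlap is only a clue. When the overlap is small, a statistical test is still needed to judge significance.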

[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: species of hummingbird; y-axis: mean bill length (±0.1 mm). A. colubris: 15.9 mm (n=10); C. latirostris: 18.8 mm (n=10).]

Our results show a very small overlap between the two sets of data.

So how do we know if the difference is significant or not?

We need to use a statistical test.

The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.

The Null Hypothesis (H0)

“There is no significant difference.”

This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.

Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table

P value      0.1    0.05    0.02    0.01
confidence   90%    95%     98%     99%

df
 1           6.31   12.71   31.82   63.66
 2           2.92    4.30    6.96    9.92
 3           2.35    3.18    4.54    5.84
 4           2.13    2.78    3.75    4.60
 5           2.02    2.57    3.37    4.03
 6           1.94    2.45    3.14    3.71
 7           1.89    2.36    3.00    3.50
 8           1.86    2.31    2.90    3.36
 9           1.83    2.26    2.82    3.25
10           1.81    2.23    2.76    3.17

The numbers in the body of the table are the “critical values”.

We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

“Degrees of Freedom (df)” is the total sample size minus two.

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?
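Looking up a critical value takes two steps: work out the degrees of freedom, then read across the table. The Python sketch below copies only the P = 0.05 column from the example table (df 1 to 10). The function names and the choice to raise an error for df values outside the table are assumptions made for this illustration.

```python
# Two-tailed critical values at P = 0.05, copied from the example table (df 1 to 10)
CRITICAL_T_P05 = {
    1: 12.71, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57,
    6: 2.45, 7: 2.36, 8: 2.31, 9: 2.26, 10: 2.23,
}


def degrees_of_freedom(n_a: int, n_b: int) -> int:
    """Total sample size minus two, for two samples of sizes n_a and n_b."""
    return n_a + n_b - 2


def critical_value(n_a: int, n_b: int) -> float:
    """Look up the P = 0.05 critical value for two samples."""
    df = degrees_of_freedom(n_a, n_b)
    if df not in CRITICAL_T_P05:
        raise ValueError(f"df = {df} is outside this small example table")
    return CRITICAL_T_P05[df]


# Two hypothetical samples of six measurements each
print(degrees_of_freedom(6, 6), critical_value(6, 6))
```

Larger samples give a higher df and a lower critical value, so a smaller t is enough to reject H0.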

Page 27: Statistical Analysis

The Null Hypothesis (H0)

"There is no significant difference"

This is the 'default' hypothesis that we always test. In our conclusion we either accept the null hypothesis or reject it.

A t-test can be used to test whether the difference between two means is significant:
  • If we accept H0, then the means are not significantly different.
  • If we reject H0, then the means are significantly different.

Remember:
  • We are never 'trying' to get a difference. We design carefully controlled experiments and then analyse the results using statistical analysis.

Example two-tailed t-table ("critical values")

P value       0.1     0.05    0.02    0.01
Confidence    90%     95%     98%     99%

Degrees of freedom
 1            6.31   12.71   31.82   63.66
 2            2.92    4.30    6.96    9.92
 3            2.35    3.18    4.54    5.84
 4            2.13    2.78    3.75    4.60
 5            2.02    2.57    3.37    4.03
 6            1.94    2.45    3.14    3.71
 7            1.89    2.36    3.00    3.50
 8            1.86    2.31    2.90    3.36
 9            1.83    2.26    2.82    3.25
10            1.81    2.23    2.76    3.17

We can calculate the value of 't' for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.

"Degrees of Freedom (df)" is the total sample size minus two (n1 + n2 - 2 when two samples are compared).

What happens to the value of P as the confidence in the results increases?

What happens to the critical value as the confidence level increases?
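The table lookup described above can be sketched in Python. This is only an illustration: the dictionary holds the values of the example table (df 1 to 10), while the function names and the sample sizes in the last lines are hypothetical.

```python
# Critical values copied from the example two-tailed t-table (df 1 to 10).
# Each row lists the critical values for P = 0.1, 0.05, 0.02 and 0.01.
P_VALUES = (0.1, 0.05, 0.02, 0.01)
T_TABLE = {
    1: (6.31, 12.71, 31.82, 63.66),
    2: (2.92, 4.30, 6.96, 9.92),
    3: (2.35, 3.18, 4.54, 5.84),
    4: (2.13, 2.78, 3.75, 4.60),
    5: (2.02, 2.57, 3.37, 4.03),
    6: (1.94, 2.45, 3.14, 3.71),
    7: (1.89, 2.36, 3.00, 3.50),
    8: (1.86, 2.31, 2.90, 3.36),
    9: (1.83, 2.26, 2.82, 3.25),
    10: (1.81, 2.23, 2.76, 3.17),
}


def degrees_of_freedom(n1, n2):
    """df when comparing two samples: total sample size minus two."""
    return n1 + n2 - 2


def critical_value(df, p=0.05):
    """Look up the two-tailed critical value for a given df and P value."""
    return T_TABLE[df][P_VALUES.index(p)]


# Hypothetical example: two samples of six measurements each.
df = degrees_of_freedom(6, 6)
print(df, critical_value(df, 0.05), critical_value(df, 0.01))  # 10 2.23 3.17
```

Reading along any row shows that the critical value rises as P falls, that is, as the confidence level rises.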

We usually use P < 0.05 (95% confidence) in Biology, as our data can be highly variable.

Simple explanation (of the 'minus two' in degrees of freedom): we are working in two directions, within each population and across populations.

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

At P = 0.05, the critical value (cv) read from the table is 2.069.

t was calculated as 2.15 (this is done for you).

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
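In the worked example the value of t is given, but it can also be calculated from raw data. The sketch below is a minimal illustration, assuming an unpaired test with pooled variance; the function names are hypothetical and the two small data sets in the second call are invented purely to show the mechanics.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Unpaired two-sample t using a pooled variance (always positive)."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return abs(mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))


def decide(t, cv):
    """Apply the rule from the slide: reject H0 only when t exceeds cv."""
    return "reject H0" if t > cv else "accept H0"


# The worked example: t = 2.15, cv = 2.069.
print(decide(2.15, 2.069))  # reject H0

# Invented data: df = 3 + 3 - 2 = 4, so cv = 2.78 at P = 0.05.
t = t_statistic([1, 2, 3], [2, 3, 4])
print(round(t, 2), decide(t, 2.78))
```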

Critical value (cv) = 2.045
Conclusion: "There is no significant difference in the size of shells between north-side and south-side snail populations."

Critical value (cv) = 2.086
Conclusion: "There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability of getting a difference at least this large by chance alone if H0 is true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two 'tails' moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population.

(Type 1 pairs the results of each individual in set A with the same individual in set B.)
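To show what 'calculating P directly' involves, here is a rough Python sketch that turns a t value and its degrees of freedom into a two-tailed P value. It does not reproduce Excel's internals: it assumes the standard Student's t density and integrates it numerically with Simpson's rule, and the function names are hypothetical. The value df = 23 is used because 2.069 is the tabulated critical value for 23 degrees of freedom at P = 0.05.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Probability density of Student's t distribution (fine for df < ~340)."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Area in both tails beyond |t|, using Simpson's rule (steps must be even)."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # area between 0 and |t|
    return 1 - 2 * central   # what is left over lies in the two tails


# At the critical value the result should be close to 0.05.
print(two_tailed_p(2.069, 23))
# The worked example (t = 2.15) gives a P value below 0.05.
print(two_tailed_p(2.15, 23))
```

A P value below the chosen threshold leads to the same decision as t > cv in the table method.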

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
  • Where there is overlap, there is not a significant difference.
  • Where there is no overlap, there is a significant difference.
  • If the overlap (or difference) is small, a t-test should still be carried out.

=CONFIDENCE.NORM(0.05,stdev,samplesize)
e.g. =CONFIDENCE.NORM(0.05,C15,10)
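For readers working outside Excel, the sketch below mirrors the CONFIDENCE.NORM calculation in Python: the half-width of the interval is the normal-distribution z value multiplied by the standard deviation and divided by the square root of the sample size. The function names and the two data sets are hypothetical and serve only to illustrate the overlap check.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_norm(alpha, sd, n):
    """Half-width of the confidence interval: z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * sd / sqrt(n)


def intervals_overlap(sample_a, sample_b, alpha=0.05):
    """True if the two confidence-interval error bars overlap."""
    half_a = confidence_norm(alpha, stdev(sample_a), len(sample_a))
    half_b = confidence_norm(alpha, stdev(sample_b), len(sample_b))
    low_a, high_a = mean(sample_a) - half_a, mean(sample_a) + half_a
    low_b, high_b = mean(sample_b) - half_b, mean(sample_b) + half_b
    return low_a <= high_b and low_b <= high_a


# Invented measurements for two groups.
group_a = [10, 11, 12, 13, 14]
group_b = [20, 21, 22, 23, 24]
print(confidence_norm(0.05, stdev(group_a), len(group_a)))
print(intervals_overlap(group_a, group_b))  # False: the bars do not overlap
```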

Error bars can have very different purposes.

Standard deviation
  • You really need to know this.
  • Look for relative size of bars.
  • Used to indicate spread of most of the data around the mean.
  • Can imply reliability of data.

95% Confidence Intervals
  • Adds value to labs where we are looking for differences.
  • Look for overlap, not size.
  • Overlap: no significant difference.
  • No overlap: significant difference.

Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n = 21 in each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the Ig Nobel Prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
Confidence    90%     95%     98%     99%     99.5%

Degrees of freedom
 1            6.31   12.71   31.82   63.66  127.34
 2            2.92    4.30    6.96    9.92   14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552

"Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From MrT's Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine)

See: What is factual about the graph?
  • What are the axes?
  • What is being plotted?
  • What values are present?

Think: How is the graph interpreted?
  • What relationship is present?
  • Is cause implied?
  • What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
  • What do you need to know more about?

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that both correlated variables have in common, even though the two variables do not directly affect each other.
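A small simulation can make the idea of a confounding variable concrete. In the sketch below, which uses invented random data and hypothetical names, two variables are each generated from a shared third variable plus independent noise. Neither affects the other, yet they come out strongly correlated.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


random.seed(1)  # fixed seed so the run is repeatable

# The confounder drives both variables; x and y never influence each other.
confounder = [random.gauss(0, 1) for _ in range(500)]
x = [c + random.gauss(0, 0.5) for c in confounder]
y = [c + random.gauss(0, 0.5) for c in confounder]

r = pearson_r(x, y)
print(round(r, 2))  # a strong positive correlation without any causal link
```

Only an experiment that controls the shared condition could show whether x has any effect on y.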

To be able to determine causality through experimentation, we need:
  • One clearly identified independent variable.
  • Carefully measured dependent variable(s) that can be attributed to change in the independent variable.
  • Strict control of all other variables that might have a measurable impact on the dependent variable.

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
  • Atmospheric CO2 concentrations and global warming
  • Atmospheric CO2 concentrations and the rate of photosynthesis
  • Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

@IBiologyStephen

Page 29: Statistical Analysis

P value = 01 005 002 001confidence 90 95 98 99

degrees of freedom

1 631 1271 3182 6366 2 292 430 696 992 3 235 318 454 584 4 213 278 375 460 5 202 257 337 403 6 194 245 314 371 7 189 236 300 350 8 186 231 290 336 9 183 226 282 325

10 181 223 276 317

We can calculate the value of lsquotrsquo for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need

Example two-tailed t-table

ldquoDegrees of Freedom (df)rdquo is the total sample size minus two

We usually use Plt005 (95 confidence) in Biology as our data can be highly variable

Simple explanation we are working in two directions ndash within each population and across populations

ldquocritical valuesrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

005

t was calculated as 215 (this is done for you)

t cv 215

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

Conclusion ldquoThere is a significant difference in the wing spans of the two populations of birdsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20452045

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is no significant difference in the size of shells between north-side and south-side snail populationsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20862086

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is a significant difference in the resting heart rates between the two groups of swimmersrdquo

Excel can jump straight to a value of P for our resultsOne function (=ttest) compares both sets of data

As it calculates P directly (the probability that the difference is due to chance) we can determine significance directly

In this case P=000051

This is much smaller than 0005 so we are confident that we can

reject H0

The difference is unlikely to be due to chance

Conclusion There is a significant difference in bill length between A colubris and C latirostris

Two tails we assume data are normally distributed with two lsquotailsrsquo moving away from mean Type 2 (unpaired) we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)

95 Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a resultbull Where there is overlap there is not a significant differencebull Where there is no overlap there is a significant difference bull If the overlap (or difference) is small a t-test should still be carried out

no overlap

=CONFIDENCENORM(005stdevsamplesize)eg =CONFIDENCENORM(005C1510)

Error bars can have very different purposes

Standard deviation bull You really need to know thisbull Look for relative size of barsbull Used to indicate spread of most

of the data around the meanbull Can imply reliability of data

95 Confidence Intervalsbull Adds value to labs where we are

looking for differences bull Look for overlap not size

bull Overlap no sig diff bull No overlap sig dif

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture In one video the lecturer was fluent and engaging In the other video the lecturer was less fluent

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score

(Error bars = standard deviation)

n=21 n=21

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture In one video the lecturer was fluent and engaging In the other video the lecturer was less fluent

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score

(Error bars = standard deviation)

Is there a significant difference in the actual learning

n=21 n=21

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Evaluate the study 1 What do the error bars (standard deviation) tell us about reliability 2 How valid is the study in terms of sufficiency of data (population sizes (n))

n=21 n=21

Dog fleas jump higher that cat fleas winner of the IgNobel prize for Biology 2008

httpw

ww

youtubecomw

atchv=fJEZg4QN

760

P value = 01 005 002 001 0005confidence 90 95 98 99 9950

degrees of freedom

1 631 1271 3182 6366 12734 2 292 430 696 992 1409 3 235 318 454 584 745 4 213 278 375 460 560 5 202 257 337 403 477 6 194 245 314 371 432 7 189 236 300 350 403 8 186 231 290 336 383 9 183 226 282 325 369

10 181 223 276 317 358

degrees of freedom

11 180 220 272 311 350 12 178 218 268 305 343 13 177 216 265 301 337 14 176 214 262 298 333 15 175 213 260 295 329 16 175 212 258 292 325 17 174 211 257 290 322 18 173 210 255 288 320 19 173 209 254 286 317 20 172 209 253 285 315

degrees of freedom

21 172 208 252 283 314 22 172 207 251 282 312 23 171 207 250 281 310 24 171 206 249 280 309 25 171 206 249 279 308 26 171 206 248 278 307 27 170 205 247 277 306 28 170 205 247 276 305 29 170 205 246 276 304 30 170 204 246 275 303

degrees of freedom

31 170 204 245 274 302 32 169 204 245 274 302 33 169 203 244 273 301 34 169 203 244 273 300 35 169 203 244 272 300 36 169 203 243 272 299 37 169 203 243 272 299 38 169 202 243 271 298 39 168 202 243 271 298 40 168 202 242 270 297

Cartoon from httpwwwxkcdcom552

Correlation does not imply causation but it does waggle its eyebrows suggestively and gesture furtively while mouthing look over there

From MrTrsquos Excel Statbook

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs See ndash Think ndash Wonder

See What is factual about the graph bull What are the axesbull What is being plottedbull What values are present

Think How is the graph interpretedbull What relationship is presentbull Is cause impliedbull What explanations are possible and

what explanations are not possible

Wonder Questions about the graphbull What do you need to know more about

See ndash Think - WonderVisible Thinking Routine

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are lsquorisk factorsrsquo of each other There is a strong correlation between them but does this mean one causes the other

Correlation does not imply causality

Pirates vs global warming from httpenwikipediaorgwikiFlying_Spaghetti_MonsterPirates_and_global_warming

Correlation does not imply causality

Pirates vs global warming from httpenwikipediaorgwikiFlying_Spaghetti_MonsterPirates_and_global_warming

Where correlations exist we must then design solid scientific experiments to determine the cause of the relationship Sometimes a correlation exist because of confounding variables ndash conditions that the correlated variables have in common but that do not directly affect each other

To be able to determine causality through experimentation we need bull One clearly identified independent variablebull Carefully measured dependent variable(s) that can be attributed to change in the

independent variablebull Strict control of all other variables that might have a measurable impact on the

dependent variable

We need sufficient relevant repeatable and statistically significant data

Some known causal relationships bull Atmospheric CO2 concentrations and global warmingbull Atmospheric CO2 concentrations and the rate of photosynthesisbull Temperature and enzyme activity

Flamenco Dancer by Steve Coreyhttpwwwflickrcomphotos22016744N067952552148

i-Biologynet

This is a Creative Commons presentation It may be linked and embedded but not sold or re-hosted

Please consider a donation to charity via Biology4GoodClick here for more information about Biology4Good charity donations

IBiologyStephen

Page 30: Statistical Analysis


2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

Critical value (cv) at P = 0.05: 2.069

t was calculated as 2.15 (this is done for you).

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference).
If t > cv, reject H0 (there is a significant difference).

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
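The decision rule above can be written as a small Python check. This is a minimal sketch: the function name `is_significant` is an assumption for illustration, and the values 2.15 and 2.069 are the ones quoted on the slide.

```python
def is_significant(t, cv):
    """Return True when the calculated t exceeds the critical value (reject H0)."""
    # The absolute value is used because a 2-tailed test ignores the sign of t.
    return abs(t) > cv


t = 2.15    # calculated t from the slide
cv = 2.069  # critical value read from the 2-tailed t-table at P = 0.05

if is_significant(t, cv):
    print("Reject H0: there is a significant difference")
else:
    print("Accept H0: there is no significant difference")
```

With the slide's values, 2.15 > 2.069, so the check takes the "reject H0" branch.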

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

cv = 2.045

"There is no significant difference in the size of shells between north-side and south-side snail populations."

cv = 2.086

"There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability of seeing a difference at least this large by chance alone if H0 were true), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.

Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.

Two tails: we assume data are normally distributed, with two 'tails' moving away from the mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population.

(Type 1 pairs the results of each individual in set A with the same individual in set B.)
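For readers who want to see what an unpaired comparison computes, here is a hedged sketch of the t statistic for two independent samples with pooled (equal) variance, which is the calculation Excel's type 2 option corresponds to. The function name `unpaired_t` and the sample values are hypothetical; the sketch returns t and the degrees of freedom, to be compared against a t-table, rather than a P value.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / se
    return t, df


# Made-up measurements for two groups (not data from the presentation).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]

t, df = unpaired_t(group_a, group_b)
print(f"t = {t:.2f}, degrees of freedom = {df}")
```

The absolute value of t is then compared with the critical value for that number of degrees of freedom at P = 0.05.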

95% Confidence Intervals can also be plotted as error bars.

These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference
• Where there is no overlap, there is a significant difference
• If the overlap (or difference) is small, a t-test should still be carried out

=CONFIDENCE.NORM(0.05, stdev, samplesize)
e.g. =CONFIDENCE.NORM(0.05, C15, 10)
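The same half-width can be sketched in Python using only the standard library. The names `confidence_norm` and `intervals_overlap`, and the numbers used, are assumptions for illustration; the formula is the normal-distribution interval z × stdev / √n that the Excel function is documented to return.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha, stdev, size):
    """Half-width of the confidence interval around a mean (normal distribution)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * stdev / sqrt(size)


def intervals_overlap(mean_a, half_a, mean_b, half_b):
    """True if the error bars (mean +/- half-width) of two groups overlap."""
    return abs(mean_a - mean_b) <= half_a + half_b


# Hypothetical summary data: (mean, standard deviation, sample size) per group.
half_a = confidence_norm(0.05, 1.2, 10)
half_b = confidence_norm(0.05, 1.5, 10)

print(f"Group A error bar: +/- {half_a:.2f}")
print(f"Group B error bar: +/- {half_b:.2f}")
print("Overlap:", intervals_overlap(15.0, half_a, 18.0, half_b))
```

As the slide notes, when the bars only just overlap (or only just miss), the overlap check should be backed up with a t-test.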

Error bars can have very different purposes.

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no sig. diff.
• No overlap: sig. diff.
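To make the standard deviation bars concrete, the short sketch below computes the mean and sample standard deviation for one set of values and counts how many values fall within one standard deviation of the mean. The data are invented for illustration; `statistics.stdev` uses the n − 1 (sample) formula, the same convention as Excel's =STDEV.

```python
from statistics import mean, stdev

# Made-up measurements (not data from the presentation).
values = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(values)
s = stdev(values)  # sample standard deviation (n - 1 denominator)

# Count the values lying within mean +/- 1 standard deviation.
within_one_sd = [v for v in values if m - s <= v <= m + s]

print(f"mean = {m}, s = {s:.2f}")
print(f"{len(within_one_sd)} of {len(values)} values lie within 1 s of the mean")
```

A larger s for one sample than another means wider error bars and more variability around that mean.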


Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n = 21 in each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

2-tailed t-table (critical values)

P value       0.1     0.05    0.02    0.01    0.005
confidence    90%     95%     98%     99%     99.50%

degrees of freedom
 1            6.31    12.71   31.82   63.66   127.34
 2            2.92    4.30    6.96    9.92    14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97

Cartoon from http://www.xkcd.com/552/

"Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'."

From MrT's Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Page 33: Statistical Analysis

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2069

005

t was calculated as 215 (this is done for you)

t cv 215 gt 2069

If t lt cv accept H0 (there is no significant difference)If t gt cv reject H0 (there is a significant difference)

Conclusion ldquoThere is a significant difference in the wing spans of the two populations of birdsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20452045

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is no significant difference in the size of shells between north-side and south-side snail populationsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20862086

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is a significant difference in the resting heart rates between the two groups of swimmersrdquo

Excel can jump straight to a value of P for our resultsOne function (=ttest) compares both sets of data

As it calculates P directly (the probability that the difference is due to chance) we can determine significance directly

In this case P=000051

This is much smaller than 0005 so we are confident that we can

reject H0

The difference is unlikely to be due to chance

Conclusion There is a significant difference in bill length between A colubris and C latirostris

Two tails we assume data are normally distributed with two lsquotailsrsquo moving away from mean Type 2 (unpaired) we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)

95 Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a resultbull Where there is overlap there is not a significant differencebull Where there is no overlap there is a significant difference bull If the overlap (or difference) is small a t-test should still be carried out

no overlap

=CONFIDENCENORM(005stdevsamplesize)eg =CONFIDENCENORM(005C1510)

Error bars can have very different purposes

Standard deviation bull You really need to know thisbull Look for relative size of barsbull Used to indicate spread of most

of the data around the meanbull Can imply reliability of data

95 Confidence Intervalsbull Adds value to labs where we are

looking for differences bull Look for overlap not size

bull Overlap no sig diff bull No overlap sig dif

Interesting Study: Do "Better" Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video the lecturer was fluent and engaging. In the other video the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation; n=21 in each group)

Is there a significant difference in the actual learning?

Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?

Dog fleas jump higher than cat fleas: winner of the IgNobel prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
Confidence    90%     95%     98%     99%     99.5%

Degrees of
freedom
 1            6.31    12.71   31.82   63.66   127.34
 2            2.92    4.30    6.96    9.92    14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97
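Looking up a critical value from the table above can be sketched as follows. This assumes an unpaired test, where degrees of freedom = n1 + n2 - 2 (a standard result, not stated on the slide). Only four rows of the P = 0.05 column are copied in, and the names are hypothetical.

```python
# Excerpt of the P = 0.05 column of the table above (degrees of freedom -> critical value).
CRITICAL_T_P05 = {10: 2.23, 20: 2.09, 30: 2.04, 40: 2.02}


def degrees_of_freedom(n1, n2):
    """Degrees of freedom for an unpaired t-test on samples of size n1 and n2."""
    return n1 + n2 - 2


def critical_value(n1, n2):
    """Critical value of t at P = 0.05, for the rows included in the excerpt."""
    df = degrees_of_freedom(n1, n2)
    if df not in CRITICAL_T_P05:
        raise KeyError(f"df = {df} is not in this excerpt; use the full table")
    return CRITICAL_T_P05[df]


# Two groups of 21, as in the lecturer study.
print(critical_value(21, 21))
```

With n=21 in each group, the lookup lands on the last row of the table (40 degrees of freedom).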

Cartoon from http://www.xkcd.com/552

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.

From MrT's Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See - Think - Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See - Think - Wonder: Visible Thinking Routine

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common, even though the variables do not directly affect each other.
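A small simulation can make the idea of a confounding variable concrete. This is a sketch with invented data: two variables are each built from a shared confounder plus independent noise, so they correlate strongly without either one affecting the other. All names and numbers are hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed so the run is repeatable
confounder = [rng.gauss(0, 1) for _ in range(1000)]
a = [c + rng.gauss(0, 0.5) for c in confounder]  # depends on the confounder only
b = [c + rng.gauss(0, 0.5) for c in confounder]  # depends on the confounder only

r_raw = pearson_r(a, b)
# Subtracting the confounder leaves only the independent noise in each variable.
r_controlled = pearson_r(
    [x - c for x, c in zip(a, confounder)],
    [y - c for y, c in zip(b, confounder)],
)
print(round(r_raw, 2), round(r_controlled, 2))
```

The raw correlation is strong, while the correlation after removing the confounder is close to zero. This is why strict control of other variables matters in an experiment.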

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

IBiologyStephen

Page 34: Statistical Analysis

Critical value (cv) = 2.069 at P = 0.05

t was calculated as 2.15 (this is done for you)

t > cv: 2.15 > 2.069

If t < cv, accept H0 (there is no significant difference). If t > cv, reject H0 (there is a significant difference).

Conclusion: "There is a significant difference in the wing spans of the two populations of birds."
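The decision rule above can be written as a short Python function. This is a sketch only: the function name is hypothetical, and taking the absolute value of t is an assumption made so that a negative t is handled the same way in a two-tailed test.

```python
def t_test_decision(t, critical_value):
    """Compare a calculated t with the critical value from the t-table."""
    if abs(t) > critical_value:
        return "reject H0: significant difference"
    return "accept H0: no significant difference"


# Values from the wing span example: t = 2.15, critical value = 2.069.
print(t_test_decision(2.15, 2.069))  # reject H0: significant difference
```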

2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php

Critical value: 2.045

"There is no significant difference in the size of shells between north-side and south-side snail populations."

Critical value: 2.086

"There is a significant difference in the resting heart rates between the two groups of swimmers."

Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data.

As it calculates P directly (the probability of a difference this large arising by chance alone), we can determine significance directly.

In this case, P = 0.00051.

This is much smaller than 0.005, so we are confident that we can reject H0.

The difference is unlikely to be due to chance.
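What a two-tailed, unpaired test does can be sketched with Python's standard library alone. This is not Excel's implementation. It assumes the equal-variance (pooled) form of the unpaired test and finds P by numerically integrating the t distribution. The function names are hypothetical and the data are invented.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution (fine for df up to about 340)."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P: 1 minus the area between -|t| and +|t| (Simpson's rule, even steps)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = 2 * a / steps
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3


def unpaired_t_test(sample_a, sample_b):
    """Return (t, degrees of freedom, two-tailed P) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df, two_tailed_p(t, df)


# Invented measurements, for illustration only.
t, df, p = unpaired_t_test([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(round(t, 2), df, round(p, 3))
```

If P is below 0.05, reject H0. Calling `two_tailed_p(2.069, 23)` gives a value very close to 0.05, which matches the critical value used in the wing span example.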

Page 37: Statistical Analysis

20452045

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is no significant difference in the size of shells between north-side and south-side snail populationsrdquo

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

20862086

2-tailed t-table source httpwwwmedcalcorgmanualt-distributionphp

ldquoThere is a significant difference in the resting heart rates between the two groups of swimmersrdquo

Excel can jump straight to a value of P for our resultsOne function (=ttest) compares both sets of data

As it calculates P directly (the probability that the difference is due to chance) we can determine significance directly

In this case P=000051

This is much smaller than 0005 so we are confident that we can

reject H0

The difference is unlikely to be due to chance

Conclusion There is a significant difference in bill length between A colubris and C latirostris

Two tails we assume data are normally distributed with two lsquotailsrsquo moving away from mean Type 2 (unpaired) we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)

95 Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a resultbull Where there is overlap there is not a significant differencebull Where there is no overlap there is a significant difference bull If the overlap (or difference) is small a t-test should still be carried out

no overlap

=CONFIDENCENORM(005stdevsamplesize)eg =CONFIDENCENORM(005C1510)

Error bars can have very different purposes

Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most of the data around the mean
• Can imply reliability of data

95% Confidence Intervals
• Adds value to labs where we are looking for differences
• Look for overlap, not size
• Overlap: no sig. diff.
• No overlap: sig. diff.

Interesting Study: Do “Better” Lecturers Cause More Learning?

Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent.

They predicted how much they would learn on the topic (genetics), and this was compared to their actual score.

(Error bars = standard deviation)

n=21 n=21


Is there a significant difference in the actual learning?


Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?


Dog fleas jump higher than cat fleas: winner of the Ig Nobel Prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value        0.1    0.05    0.02    0.01   0.005
confidence     90%     95%     98%     99%   99.5%

degrees of freedom
 1            6.31   12.71   31.82   63.66  127.34
 2            2.92    4.30    6.96    9.92   14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97
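Reading the table can be sketched as a lookup: find the row for the degrees of freedom, take the P = 0.05 column, and compare it with the size of the calculated t. The dictionary below copies only four rows of that column, and the names `CRITICAL_T_P05` and `is_significant` are hypothetical, introduced here for illustration.

```python
# P = 0.05 critical values for a handful of degrees of freedom,
# copied from the two-tailed t-table (only four rows, for brevity)
CRITICAL_T_P05 = {10: 2.23, 20: 2.09, 29: 2.05, 40: 2.02}


def is_significant(t, df, table=CRITICAL_T_P05):
    """True if |t| exceeds the critical value, i.e. H0 can be rejected at P = 0.05."""
    critical = table[df]  # raises KeyError for rows not copied into the dict
    return abs(t) > critical


# Hypothetical t-values, invented for illustration
print(is_significant(-7.75, 10))  # |t| is well above 2.23
print(is_significant(1.50, 20))   # |t| is below 2.09
```

If t is larger than the critical value, the difference is treated as significant; otherwise H0 is not rejected.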

Cartoon from http://www.xkcd.com/552

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?

Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and what explanations are not possible?

Wonder: Questions about the graph
• What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine


Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality.

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming


Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables – conditions that the correlated variables have in common but that do not directly affect each other.
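A small simulation can make the idea of a confounding variable concrete. In this sketch, which uses invented data and hypothetical names throughout, two variables x and y are each driven by the same hidden variable plus their own random noise. Neither affects the other, yet they come out strongly correlated.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


rng = random.Random(1)  # fixed seed so the run is repeatable

# The hidden confounder: it influences both x and y
confounder = [rng.gauss(0, 1) for _ in range(500)]

# x and y each depend on the confounder, but never on each other
x = [z + rng.gauss(0, 0.3) for z in confounder]
y = [z + rng.gauss(0, 0.3) for z in confounder]

r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

A high r here says nothing about x causing y. Only an experiment that changes x while holding the confounder fixed could test that.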

To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the independent variable
• Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

IBiologyStephen




  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
Page 41: Statistical Analysis

Two tails we assume data are normally distributed with two lsquotailsrsquo moving away from mean Type 2 (unpaired) we are comparing one whole population with the other whole population

(Type 1 pairs the results of each individual in set A with the same individual in set B)

95 Confidence Intervals can also be plotted as error bars

These give a clearer indication of the significance of a resultbull Where there is overlap there is not a significant differencebull Where there is no overlap there is a significant difference bull If the overlap (or difference) is small a t-test should still be carried out

no overlap

=CONFIDENCENORM(005stdevsamplesize)eg =CONFIDENCENORM(005C1510)

Error bars can have very different purposes

Standard deviation bull You really need to know thisbull Look for relative size of barsbull Used to indicate spread of most

of the data around the meanbull Can imply reliability of data

95 Confidence Intervalsbull Adds value to labs where we are

looking for differences bull Look for overlap not size

bull Overlap no sig diff bull No overlap sig dif

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture In one video the lecturer was fluent and engaging In the other video the lecturer was less fluent

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score

(Error bars = standard deviation)

n=21 n=21

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Students watched a one-minute video of a lecture In one video the lecturer was fluent and engaging In the other video the lecturer was less fluent

They predicted how much they would learn on the topic (genetics) and this was compared to their actual score

(Error bars = standard deviation)

Is there a significant difference in the actual learning

n=21 n=21

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Evaluate the study 1 What do the error bars (standard deviation) tell us about reliability 2 How valid is the study in terms of sufficiency of data (population sizes (n))

n=21 n=21

Dog fleas jump higher than cat fleas: winner of the Ig Nobel Prize for Biology, 2008

http://www.youtube.com/watch?v=fJEZg4QN760

P value       0.1     0.05    0.02    0.01    0.005
confidence    90%     95%     98%     99%     99.5%

degrees of freedom
 1            6.31    12.71   31.82   63.66   127.34
 2            2.92    4.30    6.96    9.92    14.09
 3            2.35    3.18    4.54    5.84    7.45
 4            2.13    2.78    3.75    4.60    5.60
 5            2.02    2.57    3.37    4.03    4.77
 6            1.94    2.45    3.14    3.71    4.32
 7            1.89    2.36    3.00    3.50    4.03
 8            1.86    2.31    2.90    3.36    3.83
 9            1.83    2.26    2.82    3.25    3.69
10            1.81    2.23    2.76    3.17    3.58
11            1.80    2.20    2.72    3.11    3.50
12            1.78    2.18    2.68    3.05    3.43
13            1.77    2.16    2.65    3.01    3.37
14            1.76    2.14    2.62    2.98    3.33
15            1.75    2.13    2.60    2.95    3.29
16            1.75    2.12    2.58    2.92    3.25
17            1.74    2.11    2.57    2.90    3.22
18            1.73    2.10    2.55    2.88    3.20
19            1.73    2.09    2.54    2.86    3.17
20            1.72    2.09    2.53    2.85    3.15
21            1.72    2.08    2.52    2.83    3.14
22            1.72    2.07    2.51    2.82    3.12
23            1.71    2.07    2.50    2.81    3.10
24            1.71    2.06    2.49    2.80    3.09
25            1.71    2.06    2.49    2.79    3.08
26            1.71    2.06    2.48    2.78    3.07
27            1.70    2.05    2.47    2.77    3.06
28            1.70    2.05    2.47    2.76    3.05
29            1.70    2.05    2.46    2.76    3.04
30            1.70    2.04    2.46    2.75    3.03
31            1.70    2.04    2.45    2.74    3.02
32            1.69    2.04    2.45    2.74    3.02
33            1.69    2.03    2.44    2.73    3.01
34            1.69    2.03    2.44    2.73    3.00
35            1.69    2.03    2.44    2.72    3.00
36            1.69    2.03    2.43    2.72    2.99
37            1.69    2.03    2.43    2.72    2.99
38            1.69    2.02    2.43    2.71    2.98
39            1.68    2.02    2.43    2.71    2.98
40            1.68    2.02    2.42    2.70    2.97
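A worked comparison against the table might look like the sketch below. It assumes a two-sample t-test with pooled variance and degrees of freedom n1 + n2 − 2, which may differ from the exact method used in class. The flea jump heights are invented for illustration, and `t_statistic` and `CRITICAL_005` are hypothetical names. Only a few rows of the P = 0.05 column are copied in.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Two-sample t value using a pooled variance (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = abs(mean(a) - mean(b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df

# Critical values copied from the P = 0.05 column of the table (a few rows only)
CRITICAL_005 = {8: 2.31, 10: 2.23, 20: 2.09, 40: 2.02}

# Hypothetical jump heights (cm) for two groups of fleas; not real data
dog = [30.1, 31.5, 29.8, 32.0, 30.7, 31.2]
cat = [27.9, 28.4, 29.0, 27.5, 28.8, 28.1]
t, df = t_statistic(dog, cat)
significant = t > CRITICAL_005[df]
print(f"t = {t:.2f}, df = {df}, significant at P = 0.05: {significant}")
```

If the calculated t is greater than the critical value for the matching degrees of freedom, the difference between the two means is treated as significant at that P value.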

Cartoon from http://www.xkcd.com/552

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’.

From MrT’s Excel Statbook

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs: See – Think – Wonder

See: What is factual about the graph?
  • What are the axes?
  • What is being plotted?
  • What values are present?

Think: How is the graph interpreted?
  • What relationship is present?
  • Is cause implied?
  • What explanations are possible, and what explanations are not possible?

Wonder: Questions about the graph
  • What do you need to know more about?

See – Think – Wonder: Visible Thinking Routine

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that both correlated variables have in common, even though the two variables do not directly affect each other.
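A confounding variable can be illustrated with a small simulation. In this sketch all data are randomly generated, `pearson_r` is a hand-written helper, and the noise levels are arbitrary assumptions. Two variables each depend on a shared third variable but never on each other, yet they still show a strong correlation.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
confounder = [rng.gauss(0, 1) for _ in range(500)]
# x and y each depend on the confounder plus their own noise, never on each other
x = [c + rng.gauss(0, 0.5) for c in confounder]
y = [c + rng.gauss(0, 0.5) for c in confounder]
r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

The correlation between x and y is high even though neither one causes the other, which is why a controlled experiment is needed to test for cause.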

To be able to determine causality through experimentation, we need:
  • One clearly identified independent variable
  • Carefully measured dependent variable(s) that can be attributed to change in the independent variable
  • Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
  • Atmospheric CO2 concentrations and global warming
  • Atmospheric CO2 concentrations and the rate of photosynthesis
  • Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations.

@IBiologyStephen

Page 46: Statistical Analysis

Interesting Study Do ldquoBetterrdquo Lecturers Cause More Learning

Find out more here httppriceonomicscomis-this-why-ted-talks-seem-so-convincing

Evaluate the study 1 What do the error bars (standard deviation) tell us about reliability 2 How valid is the study in terms of sufficiency of data (population sizes (n))

n=21 n=21

Dog fleas jump higher that cat fleas winner of the IgNobel prize for Biology 2008

httpw

ww

youtubecomw

atchv=fJEZg4QN

760

P value = 01 005 002 001 0005confidence 90 95 98 99 9950

degrees of freedom

1 631 1271 3182 6366 12734 2 292 430 696 992 1409 3 235 318 454 584 745 4 213 278 375 460 560 5 202 257 337 403 477 6 194 245 314 371 432 7 189 236 300 350 403 8 186 231 290 336 383 9 183 226 282 325 369

10 181 223 276 317 358

degrees of freedom

11 180 220 272 311 350 12 178 218 268 305 343 13 177 216 265 301 337 14 176 214 262 298 333 15 175 213 260 295 329 16 175 212 258 292 325 17 174 211 257 290 322 18 173 210 255 288 320 19 173 209 254 286 317 20 172 209 253 285 315

degrees of freedom

21 172 208 252 283 314 22 172 207 251 282 312 23 171 207 250 281 310 24 171 206 249 280 309 25 171 206 249 279 308 26 171 206 248 278 307 27 170 205 247 277 306 28 170 205 247 276 305 29 170 205 246 276 304 30 170 204 246 275 303

degrees of freedom

31 170 204 245 274 302 32 169 204 245 274 302 33 169 203 244 273 301 34 169 203 244 273 300 35 169 203 244 272 300 36 169 203 243 272 299 37 169 203 243 272 299 38 169 202 243 271 298 39 168 202 243 271 298 40 168 202 242 270 297

Cartoon from httpwwwxkcdcom552

Correlation does not imply causation but it does waggle its eyebrows suggestively and gesture furtively while mouthing look over there

From MrTrsquos Excel Statbook

httpdiabetes-obesityfindthedataorgb240Correlations-between-diabetes-obesity-and-physical-activity

Interpreting Graphs See ndash Think ndash Wonder

See What is factual about the graph bull What are the axesbull What is being plottedbull What values are present

Think How is the graph interpretedbull What relationship is presentbull Is cause impliedbull What explanations are possible and

what explanations are not possible

Wonder Questions about the graphbull What do you need to know more about

See ndash Think - WonderVisible Thinking Routine

http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity

Diabetes and obesity are 'risk factors' of each other. There is a strong correlation between them, but does this mean one causes the other?

Correlation does not imply causality

Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming

Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that both correlated variables have in common, even though the two variables do not directly affect each other.
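The effect of a confounding variable can be sketched with simulated data. In the Python example below, two variables `x` and `y` never influence each other, yet both depend on a shared `confounder`, so they end up strongly correlated. All names and numbers are invented for illustration, and the partial correlation used to "control" for the confounder is one common approach, not a method taken from the slides.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def partial_r(xs, ys, zs):
    """Correlation of x and y after removing what each shares with z."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)  # fixed seed so the run is repeatable

# The confounder drives both x and y; x and y have no direct link.
confounder = [rng.gauss(0, 1) for _ in range(1000)]
x = [z + rng.gauss(0, 0.5) for z in confounder]
y = [z + rng.gauss(0, 0.5) for z in confounder]

r_raw = pearson_r(x, y)
r_controlled = partial_r(x, y, confounder)
print(f"r(x, y) = {r_raw:.2f}")
print(f"r(x, y | confounder) = {r_controlled:.2f}")
```

The raw correlation is strong, but it largely disappears once the shared dependence on the confounder is accounted for, which is why a correlation alone cannot establish that one variable causes the other.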

To be able to determine causality through experimentation, we need:
  • One clearly identified independent variable
  • Carefully measured dependent variable(s) that can be attributed to change in the independent variable
  • Strict control of all other variables that might have a measurable impact on the dependent variable

We need sufficient, relevant, repeatable and statistically significant data.

Some known causal relationships:
  • Atmospheric CO2 concentrations and global warming
  • Atmospheric CO2 concentrations and the rate of photosynthesis
  • Temperature and enzyme activity

Flamenco Dancer by Steve Corey: http://www.flickr.com/photos/22016744@N06/7952552148

i-Biology.net

This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.

Please consider a donation to charity via Biology4Good.

@IBiologyStephen
