PCB 3043L - General Ecology Data Analysis. OUTLINE Organizing an ecological study Basic sampling...


PCB 3043L - General Ecology

Data Analysis


OUTLINE

• Organizing an ecological study
• Basic sampling terminology
• Statistical analysis of data
  – Why use statistics?
  – Describing data
    • Measures of central tendency
    • Measures of spread
    • Normal distributions
• Using Excel
  – Producing tables
  – Producing graphs
  – Analyzing data
  – Statistical tests
    • T-Tests
    • ANOVA
    • Regression


Organizing an ecological study

• What is the aim of the study?
• What is the main question being asked?
• What are your hypotheses?
• Collect data
• Summarize data in tables
• Present data graphically
• Statistically test your hypotheses
• Analyze the statistical results
• Present a conclusion to the proposed question


Basic sampling terminology

• Variables

• Populations

• Samples

• Parameters

• Statistics


What is a variable?

• Variable: any defined characteristic that varies from one biological entity to another.

• Examples: plant height, bird weight, human eye color, no. of tree species

• If an individual is selected randomly from a population, it may display a particular height, weight, etc.

• If several individuals are selected, their characteristics may be very similar or very different.


What is a population?

• Population: the entire collection of measurements of a variable of interest.

• Example: if we are interested in the heights of pine trees in Everglades National Park (plant height is our variable), then our population would consist of the heights of all the pine trees in Everglades National Park.


What is a sample?

• Sample: smaller groups or subsets of the population which are measured and used to estimate the distribution of the variable within the true population

• Example: the heights of 100 pine trees in Everglades National Park may be used to estimate the heights of trees within the entire population (which actually consists of thousands of trees)


What is a parameter?

• Parameter: any calculated measure used to describe or characterize a population

• Example: the average height of pine trees in Everglades National Park


What is a statistic?

• Statistic: an estimate of any population parameter

• Example: the average height of a sample of 100 pine trees in Everglades National Park


Why use statistics?

• It is not always possible to obtain measures and calculate parameters of variables for the entire population of interest

• Statistics allow us to estimate these values for the entire population based on multiple, random samples of the variable of interest

• The larger the number of individuals sampled, the closer the estimated measure tends to be to the true population measure

• Statistics also allow us to efficiently compare populations to determine differences among them

• Statistics allow us to determine relationships between variables
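The effect of sample size can be illustrated with a small simulation. This is only a sketch: the population of 5000 tree heights (mean 10, standard deviation 2) is hypothetical and is not data from the course, and the helper name `mean_abs_error` is made up for this example.

```python
import random
import statistics

def mean_abs_error(sample_size, population, true_mean, rng, repeats=200):
    """Average distance between the sample mean and the population mean
    over many random samples of the given size."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)  # random sample without replacement
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 5000 tree heights (assumed values, for illustration only)
population = [rng.gauss(10.0, 2.0) for _ in range(5000)]
true_mean = statistics.fmean(population)  # the parameter

error_small = mean_abs_error(10, population, true_mean, rng)    # statistic from n = 10
error_large = mean_abs_error(1000, population, true_mean, rng)  # statistic from n = 1000

print("average error with n = 10:  ", round(error_small, 3))
print("average error with n = 1000:", round(error_large, 3))
```

On average, the sample mean from 1000 individuals should land much closer to the population mean than the sample mean from 10 individuals.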


Statistical analysis of data

• Measures of central tendency
• Measures of dispersion and variability

Heights of pine trees at 2 sites in Everglades National Park

Site 1    Site 2
5         4
7         2
3         8
8         3
6         7


Measures of central tendency

• Where is the center of the distribution?

mean ($\bar{x}$ or $\mu$): arithmetic mean, $\bar{x} = \frac{\sum x_i}{n}$

median: the value in the middle of the ordered data set

mode: the most commonly occurring value

Example data set: 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10

Mean = (1 + 2 + 2 + 2 + 3 + 5 + 6 + 7 + 8 + 9 + 10)/11 = 55/11 = 5

Median = 5 (the 6th of the 11 ordered values 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10)

With an even number of values, e.g. 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10, 11, the median is the mean of the two middle values: (5 + 6)/2 = 5.5

Mode = 2 (the value 2 occurs three times in 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10)
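The same three measures can be checked outside Excel. The sketch below uses Python's standard `statistics` module on the example data set from this slide; the variable names are chosen only for illustration.

```python
import statistics

# Example data set from the slide
data = [1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(data)                  # sum of the values divided by n
median = statistics.median(data)              # middle value of the ordered data
median_even = statistics.median(data + [11])  # even n: mean of the two middle values
mode = statistics.mode(data)                  # most commonly occurring value

print("mean:", mean)
print("median (11 values):", median)
print("median (12 values):", median_even)
print("mode:", mode)
```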


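Measures of dispersion and variability

• How widely are the data distributed?

range: largest value minus smallest value

variance ($s^2$ or $\sigma^2$): $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

standard deviation ($s$ or $\sigma$): $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$

[Figure: two distributions, one with a large spread and one with a small spread]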


Measures of dispersion and variability

Example data set: 0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10

Variance = 9.8
Standard Deviation = 3.13
Range = 10

Example data set: 0, 10, 30, 30, 50, 50, 50, 70, 70, 90, 100

Variance = 980
Standard Deviation = 31.30
Range = 100

[Figures: histogram of each data set; x-axis: Value, y-axis: Number of Occurrences]
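As a cross-check on the two example data sets, the sketch below computes the range, sample variance and sample standard deviation with Python's standard `statistics` module. `statistics.variance` and `statistics.stdev` divide by n - 1; the helper name `describe_spread` is made up for this example.

```python
import statistics

def describe_spread(data):
    """Return range, sample variance and sample standard deviation."""
    return {
        "range": max(data) - min(data),
        "variance": statistics.variance(data),  # sum of squared deviations / (n - 1)
        "sd": statistics.stdev(data),           # square root of the sample variance
    }

narrow = describe_spread([0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10])
wide = describe_spread([0, 10, 30, 30, 50, 50, 50, 70, 70, 90, 100])

for label, result in (("first data set", narrow), ("second data set", wide)):
    print(label, "range:", result["range"],
          "variance:", round(result["variance"], 2),
          "sd:", round(result["sd"], 2))
```

Multiplying every value by 10 multiplies the range and standard deviation by 10 and the variance by 100, which is the pattern in the two examples.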


Normal distribution of data

• A data set in which most values are around the mean, with fewer observations towards the extremes of the range of values

• The distribution is symmetrical about the mean


Proportions of a Normal Distribution

• A normal population of 1000 body weights
• μ = 70 kg, σ = 10 kg
• 500 weights are > 70 kg
• 500 weights are < 70 kg

[Figure: Weights of Black Bears in Bunting Park; x-axis: Weights (kg), y-axis: No. of bears]


Proportions of a Normal Distribution

• How many bears have a weight > 80 kg?
• μ = 70 kg, σ = 10 kg, X = 80 kg
• We use an equation to tell us how many standard deviations from the mean the X value is located:

$Z = \frac{X - \mu}{\sigma} = \frac{80 - 70}{10} = 1$

• We then use a special table to tell us what proportion of a normal distribution lies beyond this Z value

• This proportion is equal to the probability of drawing at random a measurement (X) greater than 80 kg

[Figure: Weights of Black Bears in Bunting Park; x-axis: Weights (kg), y-axis: No. of bears]


Z table

• Look for Z value on table (1.0)

• Find associated P value (0.1587)

• The P value states that there is a 15.87% (0.1587 × 100) chance that a bear selected at random from the population of 1000 bears will have a weight greater than 80 kg
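The table lookup can be reproduced in code. The sketch below assumes Python's standard `statistics.NormalDist` as a stand-in for the printed Z table; it computes the Z value for the bear example and the proportion of a normal distribution that lies beyond it.

```python
from statistics import NormalDist

mu = 70.0     # population mean weight (kg)
sigma = 10.0  # population standard deviation (kg)
x = 80.0      # weight of interest (kg)

# Number of standard deviations between X and the mean
z = (x - mu) / sigma

# Proportion of a standard normal distribution lying beyond Z (upper tail)
p_greater = 1.0 - NormalDist().cdf(z)

# Expected number of bears heavier than 80 kg in a population of 1000
expected_bears = 1000 * p_greater

print("Z =", z)
print("P(X > 80 kg) =", round(p_greater, 4))
print("expected bears above 80 kg:", round(expected_bears))
```

Under these assumptions, roughly 159 of the 1000 bears would be expected to weigh more than 80 kg.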


Probability distribution tables

• There are multiple probability tables for different types of statistical tests.

e.g. Z-Table, t-Table, χ²-Table

• Each allows you to associate a “critical value” with a “P value”

• This P value is used to determine the significance of statistical results


Using Excel

• Program used to organize data

• Produce tables

• Perform calculations

• Make graphs

• Perform statistical tests


Organizing data in tables

• Allows you to arrange data in a format that is best for analysis

• The following are the steps you would use:


Performing calculations

• Allows you to perform several calculations

• Sum, Average, Variance, Standard deviation

• Basic subtraction, addition, multiplication

• More complex formulas


Making graphs

• Bar Charts
• Scatter Plots

[Figures: an example bar chart and an example scatter plot]


Analyzing Data in Excel

Statistical tests can be done to determine:

• Whether or not there is a significant difference between two data sets (Student’s t-test)

• Whether or not there is a significant difference between more than two data sets (ANOVA)

• Whether or not there is a significant relationship between two variables (Regression analysis)


Analyzing Data in Excel

The following steps must be followed:

1. Choose an appropriate statistical test

2. State H0 and HA

3. Run test to produce Test Statistic

4. Examine P-value

5. Decide to reject or fail to reject H0


Analyzing Data in Excel

• Normally, you would have to calculate the test statistic and look up the P value on a table
• All tests done in Excel provide the P value for you
• This P value is used to determine the significance of statistical results
• This P value must be compared to an α value
• α value is usually 0.05 or less (e.g. 0.01)
• If P < 0.05, a result at least this extreme would be expected less than 5% of the time if the null hypothesis were true
• The lower the α value, the more certain we are about rejecting the null hypothesis
• First thing you must do is select which statistical test you want to perform
• This is how it is done:
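The comparison of a P value with α can be written as a simple rule. The sketch below is only an illustration: the function name `decide` is made up, and the three P values are the ones reported for the worked t-test, ANOVA and regression examples in these slides.

```python
def decide(p_value, alpha=0.05):
    """Compare a P value with the chosen alpha and return the decision on H0."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# P values reported in the worked examples of these slides
examples = [
    ("t-test", 0.09903),
    ("ANOVA", 0.002197),
    ("regression", 1.65e-08),
]

for test_name, p in examples:
    print(test_name, "P =", p, "->", decide(p))

# A stricter alpha (0.01) applied to the ANOVA example
print("ANOVA at alpha = 0.01 ->", decide(0.002197, alpha=0.01))
```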


t-Tests

• Used to compare the means of two populations and answer the question: Is there a significant difference between the two populations?

• Example: Is there a significant difference between the average height of pine trees from 2 sites in Everglades National Park?

• You cannot use this test to compare two different types of data (e.g. water depth data and soil depth data).

• It can only compare two sets of data based on the same data type (e.g. water depth data from two different sites)

• The two data sets that are being compared must be presented in the same units. (e.g. you can compare two sets of data if both are recorded in days. You cannot compare data recorded in units of days with data recorded in units of months)


• Your Null Hypothesis is always:

There is no significant difference between the two compared populations (μ1= μ2)

• Your Alternative Hypothesis is always:

There is a significant difference between the two compared populations (μ1 ≠ μ2)


t-Tests

• When you run the test, look for the p-value

• If p > 0.05 then fail to reject your Null Hypothesis and state that “there is no significant difference between the two compared populations”

• If p < 0.05 then reject your Null Hypothesis and state that “there is a significant difference between the two compared populations”


t-Tests

• When you run the test, look for the p-value

• Our results show P = 0.09903

• Therefore P > 0.05 (This means that, if the null hypothesis were true, a difference at least this large would be expected more than 5% of the time)

• So we must fail to reject the Null Hypothesis and state that “there is no significant difference between the two compared populations”
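For readers who want to see what the test computes, here is a minimal sketch of a two-sample t statistic. It rests on several assumptions: it uses the pine tree heights at Site 1 and Site 2 from the earlier table (not the data behind P = 0.09903 above), it uses the pooled-variance (equal variances) form of the test, which is only one of the variants Excel offers, and it compares against the two-tailed critical value 2.306 for α = 0.05 and 8 degrees of freedom taken from a standard t-table. The function name `pooled_t` is made up for this example.

```python
import math
import statistics

# Heights of pine trees at 2 sites (from the earlier table)
site1 = [5, 7, 3, 8, 6]
site2 = [4, 2, 8, 3, 7]

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances, with its degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df

t_stat, df = pooled_t(site1, site2)

# Two-tailed critical value for alpha = 0.05 and df = 8, from a t-table
T_CRITICAL = 2.306
significant = abs(t_stat) > T_CRITICAL

print("t =", round(t_stat, 3), "df =", df)
print("reject H0" if significant else "fail to reject H0")
```

For these two sites the t statistic is well below the critical value, so the sketch fails to reject H0.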


ANOVA

• Used to compare the means of more than two populations and answer the question: Is there a significant difference between the populations?

• Example: Is there a significant difference between the average height of pine trees from 4 sites in Everglades National Park?

• For comparing a particular feature of two or more populations, use a Single Factor ANOVA

• For comparing a particular feature of two or more populations, subdivided into two groups, use a Two Factor ANOVA

[Figure: bar chart of Number of Students (0 to 100) for Micro, Eco, Business and Statistics]


• Your Null Hypothesis is always:

There is no significant difference between the compared populations (μ1 = μ2 = μ3 = μ4 …..)

• Your Alternative Hypothesis is always:

At least one of the compared population means differs from the others (μ1, μ2, μ3, μ4 ….. are not all equal)


ANOVA

• When you run the test, look for the p-value

• If p > 0.05 then fail to reject your Null Hypothesis and state that “there is no significant difference between the compared populations”

• If p < 0.05 then reject your Null Hypothesis and state that “there is a significant difference between at least two of the compared populations”


ANOVA

• When you run the test, look for the p-value

• Our results show P = 0.002197

• Therefore P < 0.05 (This means that, if the null hypothesis were true, differences at least this large would be expected less than 5% of the time)

• So we must reject the Null Hypothesis and state that “there is a significant difference between at least two of the compared populations”


ANOVA

• Remember: the ANOVA result will only tell you that

i) None of the data sets are significantly different from each other

OR

ii) At least two of the data sets among the data sets being compared are significantly different

• If there is a significant difference between at least two data sets, it will not tell you which two.
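A single factor ANOVA boils down to one F statistic, which the sketch below computes by hand. The data are hypothetical: sites A and B reuse the two-site table from earlier, while sites C and D are invented for illustration. The critical value 3.24 is the F-table value for α = 0.05 with 3 and 16 degrees of freedom, and the function name `one_way_f` is made up for this example.

```python
import statistics

# Hypothetical pine tree heights at 4 sites (C and D are invented values)
sites = {
    "A": [5, 7, 3, 8, 6],
    "B": [4, 2, 8, 3, 7],
    "C": [9, 10, 8, 11, 9],
    "D": [6, 5, 7, 6, 6],
}

def one_way_f(groups):
    """F statistic of a single factor ANOVA and its two degrees of freedom."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.fmean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f_stat, df_between, df_within = one_way_f(list(sites.values()))

# Critical value for alpha = 0.05 with (3, 16) degrees of freedom, from an F-table
F_CRITICAL = 3.24

print("F =", round(f_stat, 2), "df =", (df_between, df_within))
print("reject H0" if f_stat > F_CRITICAL else "fail to reject H0")
```

As the slide notes, a large F only says that at least two of the sites differ; it does not say which ones.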


Regression analysis

• Used to determine whether or not there is a linear relationship between two variables and answer the question: Is there a significant linear relationship between two variables?

• Example: Is there a significant relationship between the average height of pine trees and soil depth in Everglades National Park?

• It basically creates an equation (or line) that best predicts Y values based on X values.

• You cannot use this test to compare populations. It only compares variables.

• You are looking at two different variables (e.g. water depth (cm) and plant abundance (no. of individuals)), so the data sets do not have to be presented in the same units

[Figure: scatter plot; x-axis: Price of Whiskey ($)]


• Your Null Hypothesis is always:

There is no significant linear relationship between the two variables

• Your Alternative Hypothesis is always:

There is a significant linear relationship between the two variables


• R squared: how well “y” can be predicted by “x”, i.e. how strong the linear relationship is between the two variables.
• The closer R square is to 0, the less well the line fits the data.
• The closer R square is to 1, the better the line fits the data.

Example: R square value of 0.04
• The regression line does not fit the data well
• Many of the points lie far from the line, so there is not a defined linear relationship between the two variables
• “x” cannot be used to predict “y”

Example: R square value of 0.94
• The regression line fits the data well
• The points all lie fairly close to the line, so there is a defined linear relationship between the two variables
• “x” can be used to predict “y”

[Figures: two scatter plots with regression lines; x-axis: Price of Whiskey ($), y-axis: Money Spent by TA ($)]


Regression analysis

• When you run the test, look for the Significance F or Sample p-value

• If p > 0.05 then fail to reject your Null Hypothesis and state that “There is no significant linear relationship between the two variables”

• If p < 0.05 then reject your Null Hypothesis and state that “There is a significant linear relationship between the two variables”


Regression analysis

• When you run the test, look for the p-value

• Our results show Significance F or Sample p-value = 1.65E-08 = 0.0000000165

• Therefore P < 0.05 (This means that, if the null hypothesis were true, a relationship at least this strong would be expected less than 5% of the time)

• So we must reject the Null Hypothesis and state that “There is a significant linear relationship between the two variables”

• Next look at the R squared value

• Our results show R squared = 0.975

• Therefore the line fits the data well

• “x” can be used to predict “y”

[Figure: scatter plot; x-axis: Price of Whiskey ($)]


Ecological study

• What is the aim of the study?
• What is the main question being asked?
• What are your hypotheses?
• Collect data
• Summarize data in tables
• Present data graphically
• Statistically test your hypotheses
• Analyze the statistical results
• Present a conclusion to the proposed question


Aim: To determine whether or not there are changes in heights of Pine trees with distance from the edge of a forest trail in Everglades National Park.

Hypotheses:
H0: There is no significant relationship between distance from the edge of the trail and Pine tree height
HA: There is a significant relationship between distance from the edge of the trail and Pine tree height

Results:

Average tree height of pine trees along transect from forest trail to interior forest at ENP

Distance from trail (m)    Plant heights (m)
0                          2.1
5                          2.7
10                         2.9
15                         3.1
20                         3.4
25                         3.7
30                         3.8
35                         4.5
40                         4.6
45                         4.8
50                         5.6
SUM                        41.2
AVERAGE                    3.75
STANDARD DEVIATION         1.04

[Figure: Change in tree height with distance from forest trail; x-axis: Distance from trail, y-axis: Tree height (m)]

• P = 1.65E-08. Since P < 0.05, reject H0
• Therefore, there is a significant relationship between distance from the edge of the trail and Pine tree height
• R Square = 0.97, so there is a strong positive linear relationship between distance from the trail and plant height

Discussion/Conclusion: The gap created by the trail may be adversely affecting Pine trees, such that they are shorter near the trail and become taller with distance from the trail.
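The regression line and R Square for this study can be recomputed from the table. The sketch below uses the ordinary least-squares formulas written out by hand rather than Excel's regression tool; the function name `fit_line` is made up for this example, and it does not compute the P value.

```python
# Data from the results table: distance from trail (m) and plant height (m)
distance = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
height = [2.1, 2.7, 2.9, 3.1, 3.4, 3.7, 3.8, 4.5, 4.6, 4.8, 5.6]

def fit_line(x, y):
    """Least-squares slope, intercept and R squared for y predicted from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # proportion of variation in y explained by x
    return slope, intercept, r_squared

slope, intercept, r_squared = fit_line(distance, height)

print("height =", round(intercept, 2), "+", round(slope, 4), "* distance")
print("R squared =", round(r_squared, 3))
```

The fitted slope is positive (about 0.06 m of height per metre from the trail), and R squared comes out close to the 0.97 reported above.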


Three questions:

1. T-test

2. Single factor ANOVA

3. Regression analysis

Assignment – Worksheet 1