Basic Statistical Concepts M. Burgman & J. Carey 2002.

Page 1: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Basic Statistical Concepts

M. Burgman & J. Carey 2002

Page 2: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Statistical Population

• The entire underlying set of individuals from which samples are drawn.

e.g. 0.25 m² quadrats are used to count barnacles on a sea shore.

• The population is defined implicitly by the sampling frame.

Page 3: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Strategies

• Define survey objectives

• Define population parameters to estimate

• Implement sampling strategy

i) measure every individual (cost, time, practicality especially if destructive)

ii) measure a representative portion of the population (a sample)

Page 4: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Statistical Sample

• An aggregate of objects from which measurements are taken.

• A representative subset of a population.

Page 5: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Simple Random Sampling

• Every unit and combination of units in the population has an equal chance of selection.

a) with replacement

b) without replacement

c) finite and infinite populations
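As a minimal sketch of the difference between sampling with and without replacement, the snippet below draws ten units from a hypothetical sampling frame of 100 quadrat positions. The frame, the seed and the sample size are illustrative assumptions, not values from the slides.

```python
import random

# Hypothetical sampling frame: 100 quadrat positions, labelled 0..99.
frame = list(range(100))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# b) Without replacement: each unit can appear at most once in the sample.
without_replacement = rng.sample(frame, k=10)

# a) With replacement: the same unit may be drawn more than once.
with_replacement = rng.choices(frame, k=10)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```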

Page 6: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Sampling Objectives

• To obtain an unbiased estimate of a population mean

• To assess the precision of the estimate (i.e. calculate the standard error of the mean)

• To obtain as precise an estimate of the parameters as possible for time and money spent

Page 7: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Statistics of Dispersion

Population variance $\sigma^2 = \frac{\sum (x_i - \mu)^2}{n}$

Sample variance $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Sample standard deviation $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
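The three dispersion statistics can be computed with Python's standard `statistics` module. This is only a sketch: the counts are hypothetical values chosen for easy arithmetic, and the population variance here treats the five values as if they were the whole population.

```python
import statistics

# Hypothetical barnacle counts from five quadrats (illustrative values only).
counts = [12, 15, 9, 14, 10]

# Population variance: sum of squared deviations divided by n.
pop_var = statistics.pvariance(counts)

# Sample variance: sum of squared deviations divided by n - 1.
samp_var = statistics.variance(counts)

# Sample standard deviation: square root of the sample variance.
samp_sd = statistics.stdev(counts)

print(pop_var, samp_var, samp_sd)
```

Dividing by n - 1 rather than n always gives the larger value, which is why the sample variance exceeds the population variance for the same data.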

Page 8: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Statistics of Dispersion

Standard error of the mean $s_{\bar{x}} = \sqrt{\frac{s^2}{n}}$

Coefficient of variation $CV = \frac{s}{\bar{x}}$

Covariance $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
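A short sketch of these three statistics as Python functions follows. The function names and the two data lists are hypothetical and serve only to illustrate the formulas.

```python
import math
import statistics

def standard_error(xs):
    """Standard error of the mean: sqrt(s^2 / n)."""
    return math.sqrt(statistics.variance(xs) / len(xs))

def coefficient_of_variation(xs):
    """CV as a proportion: s / mean (multiply by 100 for a percentage)."""
    return statistics.stdev(xs) / statistics.mean(xs)

def covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations / (n - 1)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical paired measurements (illustrative values only).
x = [12, 15, 9, 14, 10]
y = [3, 5, 2, 4, 1]

se_x = standard_error(x)
cv_x = coefficient_of_variation(x)
s_xy = covariance(x, y)
print(se_x, cv_x, s_xy)
```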

Page 9: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Expectations and Variances

E(X+b) = E(X) + b

E(aX) = aE(X)

E(X+Y) = E(X) + E(Y)

V(X+b) = V(X)

V(aX) = a²V(X)

V(X+Y) = V(X) + V(Y) + 2Cov(X,Y)
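These rules also hold exactly for sample means, variances and covariances, so they can be checked numerically. The sketch below does so on two hypothetical data lists with arbitrary constants a and b; none of these values come from the slides.

```python
import math
import statistics

# Hypothetical data and constants (illustrative values only).
x = [12, 15, 9, 14, 10]
y = [3, 5, 2, 4, 1]
a, b = 3, 5

def cov(u, v):
    """Sample covariance with an n - 1 divisor."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (len(u) - 1)

E, V = statistics.mean, statistics.variance
x_plus_y = [p + q for p, q in zip(x, y)]

checks = {
    "E(X+b) = E(X) + b": math.isclose(E([p + b for p in x]), E(x) + b),
    "E(aX) = aE(X)": math.isclose(E([a * p for p in x]), a * E(x)),
    "E(X+Y) = E(X) + E(Y)": math.isclose(E(x_plus_y), E(x) + E(y)),
    "V(X+b) = V(X)": math.isclose(V([p + b for p in x]), V(x)),
    "V(aX) = a^2 V(X)": math.isclose(V([a * p for p in x]), a ** 2 * V(x)),
    "V(X+Y) = V(X) + V(Y) + 2Cov(X,Y)": math.isclose(
        V(x_plus_y), V(x) + V(y) + 2 * cov(x, y)
    ),
}

for rule, holds in checks.items():
    print(f"{rule}: {holds}")
```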

Page 10: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Confidence Limits

• For the mean: $\mu = \bar{x} \pm t_{[\alpha,\, n-1]} \, \frac{s}{\sqrt{n}}$

• This formula sets confidence limits to means of samples from a normally distributed population.
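As a worked sketch of this formula, the function below computes the limits mean ± t · s/√n. It is applied to the ten male jackal jaw lengths quoted on the Randomization Tests slide. The critical value 2.262 is the two-tailed Student's t for α = 0.05 and 9 degrees of freedom, taken from a standard t-table rather than computed; the function name is hypothetical.

```python
import math
import statistics

def confidence_limits(xs, t_crit):
    """Return (lower, upper) limits: mean -/+ t_crit * s / sqrt(n)."""
    n = len(xs)
    mean = statistics.mean(xs)
    half_width = t_crit * statistics.stdev(xs) / math.sqrt(n)
    return mean - half_width, mean + half_width

# Male jackal jaw lengths (mm) from the Randomization Tests slide.
males = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]

# Two-tailed t for alpha = 0.05 with n - 1 = 9 df, from a t-table (assumption:
# the value is supplied by the user, not derived in this sketch).
T_CRIT_05_9DF = 2.262

lower, upper = confidence_limits(males, T_CRIT_05_9DF)
print(f"95% confidence limits: {lower:.2f} to {upper:.2f} mm")
```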

Page 11: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Confidence Limits

• Confidence limits of the mean define a region that we expect will enclose the true mean.

• The likelihood that this is true is determined by α. If we set α at 5% (hence specifying 95% confidence intervals), then in repeated sampling the region enclosed by the confidence intervals will capture the true mean about 95 times out of 100.
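The "95 times out of 100" reading can be illustrated by simulation. The sketch below repeatedly draws samples of 10 from a normal population with a known mean and counts how often the 95% interval encloses that mean. The population mean, standard deviation, seed and number of trials are arbitrary assumptions, and 2.262 is the tabulated two-tailed t for α = 0.05 and 9 degrees of freedom.

```python
import math
import random
import statistics

T_CRIT = 2.262  # two-tailed t, alpha = 0.05, 9 df (from a t-table)

def interval_covers(true_mean, sample, t_crit):
    """True if mean -/+ t * s / sqrt(n) encloses the true mean."""
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(len(sample))
    return abs(statistics.mean(sample) - true_mean) <= half_width

rng = random.Random(1)  # fixed seed so the sketch is repeatable
true_mean, true_sd, n, trials = 50.0, 10.0, 10, 2000  # hypothetical values

hits = 0
for _ in range(trials):
    sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    if interval_covers(true_mean, sample, T_CRIT):
        hits += 1

coverage = hits / trials
print(f"Proportion of intervals enclosing the true mean: {coverage:.3f}")
```

The proportion should sit close to 0.95, with some scatter because the number of trials is finite.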

Page 12: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Confidence Limits

• The same formula may be used to set confidence limits to any statistic as long as it follows the normal distribution,

e.g. the median, the average (absolute) deviation, the standard deviation (s), the coefficient of variation, or skewness.

Page 13: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

How many samples?

$n = \frac{t^2 \, CV^2}{E^2}$

where:

• CV is the coefficient of variation (expressed as a %) of samples in a pilot survey

• t is Student's t value for a specified degree of certainty and the number of samples used to estimate the parameters

• E is the specified error limits (expressed as a % of the mean)
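A small sketch of the sample-size formula n = t²CV²/E² follows. The function name and the example inputs (t = 2.0, CV = 30%, E = 10%) are hypothetical; in practice t would be looked up for the chosen certainty and the pilot sample size.

```python
import math

def required_samples(t, cv_percent, error_percent):
    """n = t^2 * CV^2 / E^2, rounded up to a whole number of samples.

    cv_percent and error_percent must both be percentages of the mean.
    """
    if error_percent <= 0:
        raise ValueError("error_percent must be positive")
    return math.ceil((t ** 2) * (cv_percent ** 2) / (error_percent ** 2))

# Hypothetical pilot survey: CV of 30%, target error of 10% of the mean,
# and t taken as roughly 2.0 for illustration.
n_needed = required_samples(t=2.0, cv_percent=30, error_percent=10)
print(n_needed)
```

Halving the acceptable error E quadruples the number of samples required, because E enters the formula squared.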

Page 14: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Measurement Error

• Measured variation may be decomposed into

natural variation + measurement error

• Measurement error may be reduced by improving sampling protocols and instrumentation

• Reducing measurement error increases confidence in estimates without increasing the number of samples.

• Precision (variation) v. accuracy (bias)

Page 15: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Components of Measurement Error

• Systematic errors

• Random errors

Causes

• Measurement assumptions

(shape, size, allometry)

• Instrument error

• Operator error

Page 16: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Kinds of Uncertainty

1. Epistemic Uncertainty

• inherent environmental variation

• variation in population responses due to demographic structure

• imperfect knowledge

• model mis-specification

• measurement error (assessment error)

• ignorance

Page 17: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Kinds of Uncertainty

2. Semantic Uncertainty

• Ambiguity - interpretation of a phrase in two or more distinct ways.

“Juvenile Court to Try Shooting Defendant”

“Local High School Dropouts Cut in Half”

• Vagueness - leads to borderline cases.

e.g. tall; endangered; adult

Page 18: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Kinds of Uncertainty

More examples of vagueness:

• Tree crown

tree foliage bounded by the first healthy branch forming part of the main crown and extending as far or further than any branch above it.

forked trees? dead branches?

Page 19: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Kinds of Uncertainty

More examples of vagueness:

• Epilimnion

the upper layer of water in a lake, bounded by a thermocline

• Soil horizon

a relatively uniform soil layer, differentiated by contrasts in mineral or organic properties.

Page 20: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Sampling Design Criteria

• Operational simplicity

• Unambiguous interpretation

Page 21: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Null-Hypothesis Tests

An example of hypothesis testing in which management alternatives are judged on the basis of the outcome of the test.

Hypothesis               Symbol   Description
Null hypothesis          H0       The strategy has no effect.
Alternative hypothesis   H1       The strategy is effective.

Page 22: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Statistical Outcomes in Null Hypothesis Testing

                          Test result
Reality                   Significant (H0 rejected)   Not significant (H0 not rejected)
Difference (H0 false)     correct                     Type II error (β)
No difference (H0 true)   Type I error (α)            correct

Page 23: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

The Character of Error Types

Type I errors

• Alarmism/Over-reaction

• Incorrectly accepting a (false) alternative hypothesis

• Concluding (incorrectly) that there is an impact

Type II errors

• False confidence/Cornucopia

• Incorrectly "accepting" a (false) null hypothesis

• Concluding (incorrectly) that there is no impact

Page 24: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

t-tests

A t-test of the hypothesis that two sample means come from a population with equal μ,

i.e. H0: μ1 = μ2

$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{1}{n}\left(s_1^2 + s_2^2\right)}}$
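As a sketch, the t statistic can be computed directly from the formula t = (Ȳ1 − Ȳ2) / √((s1² + s2²)/n). It assumes that the single n in the denominator means both samples have the same size, which is a reading of the slide rather than something it states. The data are the jackal jaw lengths from the Randomization Tests slide; the function name is hypothetical.

```python
import math
import statistics

def t_equal_n(y1, y2):
    """t = (mean1 - mean2) / sqrt((s1^2 + s2^2) / n), for equal sample sizes."""
    if len(y1) != len(y2):
        raise ValueError("this form of the test assumes equal sample sizes")
    n = len(y1)
    se_diff = math.sqrt((statistics.variance(y1) + statistics.variance(y2)) / n)
    return (statistics.mean(y1) - statistics.mean(y2)) / se_diff

# Jackal jaw lengths (mm) from the Randomization Tests slide.
males = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
females = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

t_stat = t_equal_n(males, females)
df = 2 * (len(males) - 1)  # degrees of freedom for two samples of size n
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic is then compared with the critical value of Student's t for the chosen α and these degrees of freedom.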

Page 25: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Distributions of Test Statistics

[Figure: two curves of P(statistic), the distribution of the mean of the actual population and the distribution under the null hypothesis (assumed to be true until rejected), with the critical value marked.]

Page 26: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Assumptions

The assumption of independence: correlation and autocorrelation

1. If the error in one object is related to the error in others, there will be bias, e.g. measure one and compare others.

2. the effective sample size may be less than the number of samples if measurements are correlated in space or time.

Page 27: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

The effects of the non-independence of data on errors of interpretation of statistical tests

                Non-independence
Correlation     Among treatments     Within treatments
Positive        Increased Type II    Increased Type I
Negative        Increased Type I     Increased Type II

Page 28: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Randomization Tests

Jaw lengths of Golden Jackals:

Males: 120, 107, 110, 116, 114,

111, 113, 117, 114, 112

Females: 110, 111, 107, 108, 110,

105, 107, 106, 111, 111

Is there a difference in jaw length between males and females?

Page 29: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Randomization Tests

1. Calculate means for males and for females.

2. Calculate the difference between the means, D0 = x̄m − x̄f = 4.8

3. Randomly allocate 10 sample lengths to each of 2 groups

4. Calculate Di, the difference between means for these 2 groups

5. Repeat Steps 3 & 4 many times
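The five steps can be sketched in Python using the jackal jaw lengths given in the slides. The function name, the seed and the default of 5000 runs are assumptions for illustration. Because the allocation is random, the count of large differences will vary slightly from run to run and between seeds.

```python
import random

def randomization_test(group_a, group_b, runs=5000, seed=0):
    """One-sided randomization test for a difference between two means.

    Returns (D0, proportion of randomized differences >= D0).
    """
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    # Steps 1 and 2: observed difference between the group means.
    d0 = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    count = 0
    for _ in range(runs):  # Step 5: repeat many times
        rng.shuffle(pooled)  # Step 3: random allocation to two groups
        # Step 4: difference between the means of the randomized groups.
        d_i = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if d_i >= d0 - 1e-12:  # small tolerance for floating-point ties
            count += 1
    return d0, count / runs

males = [120, 107, 110, 116, 114, 111, 113, 117, 114, 112]
females = [110, 111, 107, 108, 110, 105, 107, 106, 111, 111]

d0, p_value = randomization_test(males, females)
print(f"D0 = {d0:.1f}, proportion of Di >= D0: {p_value:.4f}")
```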

Page 30: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Randomization Tests

• If D0 is unusually large, the observed data are unlikely to have arisen if there was no difference between males and females.

[Figure: histogram of the randomization distribution. x-axis: Difference in jaw length (mm), from -4 to 4; y-axis: Frequency, from 0 to 600; the observed value D0 = 4.8 is marked.]

Page 31: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Randomization Tests

• From 5000 runs, only 9 values of Di were greater than or equal to 4.8.

• 9/5000 = 0.0018.

(t-test: pHo = 0.0013)

Page 32: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Confidence Limits by Randomization

• For 95% confidence limits, the upper and lower limits, U and L, are such that they enclose 95% of the randomization distribution.

• For 99% confidence, L and U must give values at the 0.5% and 99.5% points on the distribution.
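One simple reading of these two bullets is to sort the randomization distribution and read off the values at the appropriate percentage points. The sketch below does that with a hypothetical helper, `percentile_limits`, and a stand-in distribution of the integers 1 to 1000. A real analysis would pass in the randomized statistics, and a full randomization confidence interval procedure may involve more than this.

```python
def percentile_limits(values, confidence=0.95):
    """Return (L, U) enclosing the central `confidence` share of the values."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    ordered = sorted(values)
    n = len(ordered)
    tail = (1 - confidence) / 2          # e.g. 0.025 for 95%, 0.005 for 99%
    lower_index = round(tail * n)        # first value inside the interval
    upper_index = round((1 - tail) * n) - 1  # last value inside the interval
    return ordered[lower_index], ordered[upper_index]

# Stand-in for a randomization distribution (illustrative values only).
distribution = list(range(1, 1001))

limits_95 = percentile_limits(distribution, 0.95)
limits_99 = percentile_limits(distribution, 0.99)
print(limits_95, limits_99)
```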

Page 33: Basic Statistical Concepts  M. Burgman & J. Carey 2002.

Randomization Tests

Can do randomization tests in lieu of:

• paired comparisons

• ANOVA

• multiple regression