
Nonparametric Statistics

Lecture 9

Small Sample, Non-normal Population

If the sample is large, the Central Limit Theorem applies and can be used to test hypotheses about the mean.

If the population is normal, the sampling distribution of the mean is exactly normal to begin with.

If the sample is small and the population is non-normal, what do we do?

Nonparametric statistics is a sub-field of statistics that draws inferences about populations that cannot be assumed to follow any particular distribution.

One-Sample Example

Suppose that a nurse has been instructed to perform a procedure in a new way. Researchers recorded the change in the number of minutes it took the nurse to perform the procedure.

The data are

0.6, -0.5, 1.1, 2.4, 3.5, 2.0, -0.1, 1.0, 2.1, -0.6, -0.2

We would be hard pressed to say that these data even approximately follow a normal distribution.

Assumption of normality for small sample example

There are only 11 observations and we might be uncomfortable claiming that this distribution looks normal. Instead, it looks more uniform.

The Sign Test – 5 Steps

Assumptions: Random, independent sample

Hypotheses: Null hypothesis: Median equals zero

Alternative hypothesis: Median does not equal zero

Test statistic: p = 7/11, the sample proportion of observations greater than zero, which we compare with one-half.

The Sign Test – 5 Steps, cont.

P-value: Need exact calculation since CLT doesn’t apply with small samples. 95% CI for p with small samples: (0.308, 0.891)

Conclusion: Since 0.5 is included in the 95% confidence interval, we can’t say that the median is significantly different from zero at the 0.05 level. (We fail to reject the null hypothesis.)
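The sign test steps above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the confidence interval is computed with the Clopper-Pearson (exact binomial) method, which is an assumption. The slides do not say which exact method produced (0.308, 0.891), although this method appears to agree with it.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def sign_test_p_value(n_pos, n_neg):
    """Exact two-sided sign test: compare the sign counts with Binomial(n, 1/2)."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    tail = sum(binom_pmf(i, n, 0.5) for i in range(k + 1))
    return min(1.0, 2 * tail)

def exact_ci(k, n, alpha=0.05, tol=1e-10):
    """Clopper-Pearson interval for a proportion (assumed method), by bisection."""
    def upper_tail(p):  # P(X >= k), increasing in p
        return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

    def lower_tail(p):  # P(X <= k), decreasing in p
        return sum(binom_pmf(i, n, p) for i in range(k + 1))

    def solve(f, increasing):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if (f(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(upper_tail, True)
    upper = 1.0 if k == n else solve(lower_tail, False)
    return lower, upper

# Changes in procedure time (minutes) from the one-sample example
data = [0.6, -0.5, 1.1, 2.4, 3.5, 2.0, -0.1, 1.0, 2.1, -0.6, -0.2]
n_pos = sum(x > 0 for x in data)
n_neg = sum(x < 0 for x in data)

p_value = sign_test_p_value(n_pos, n_neg)
ci = exact_ci(n_pos, n_pos + n_neg)
print(n_pos, n_neg, round(p_value, 4), [round(b, 3) for b in ci])
```

The exact p-value and the interval lead to the same conclusion here: the p-value is well above 0.05 and the interval contains 0.5.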

The Signed Rank Test – 5 Steps

Assumptions:

• The measurement is continuous

• Independent, random sample from the population

• Distribution is symmetric

Hypotheses: H0: Median of the distribution is 0

HA: Median of distribution is non-zero

Test Statistic: Minimum of the rank sums

P-value: from the computer!

For this example, p=0.0439

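Conclusion: As usual, compare the p-value with the significance level. Here p = 0.0439 < 0.05, so we reject the null hypothesis at the 0.05 level.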

Calculation of Signed Rank Test Statistic

Order observations from smallest to largest in absolute value

|Y|(1) ≤ |Y|(2) ≤ … ≤ |Y|(n)

So from the example,

|-0.1| < |-0.2| < |-0.5| < |-0.6| = 0.6 < 1.0 < 1.1 < 2.0 < 2.1 < 2.4 < 3.5

Assign ranks to these absolute values

1, 2, … , n

In the example, 1, 2, … , 11, with the tied values |-0.6| and 0.6 each receiving the average rank 4.5

Signed Rank Test Statistic, cont…

Arrange the ranks into two groups: those whose actual values are smaller than zero and those whose actual values are larger than zero. Sum the ranks for the negative and positive observations separately.

Here, for negative values, sum of ranks = 1+2+3+4.5 = 10.5

For positive values, sum of ranks = 4.5+6+7+8+9+10+11 = 55.5

Test Statistic = smallest rank sum = 10.5
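The ranking and summing steps can be checked with a short Python sketch. The helper name `average_ranks` is made up for this illustration. The p-value is obtained by enumerating all 2^11 equally likely sign assignments under the null hypothesis, which is one way an exact p-value can be computed; the slides do not say how the software does it, but the result appears to agree with the p = 0.0439 reported for this example.

```python
from itertools import product

def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

data = [0.6, -0.5, 1.1, 2.4, 3.5, 2.0, -0.1, 1.0, 2.1, -0.6, -0.2]
nonzero = [x for x in data if x != 0]          # zeros carry no sign information
ranks = average_ranks([abs(x) for x in nonzero])

neg_sum = sum(r for x, r in zip(nonzero, ranks) if x < 0)
pos_sum = sum(r for x, r in zip(nonzero, ranks) if x > 0)
statistic = min(neg_sum, pos_sum)

# Exact null distribution: each rank is equally likely to belong to a
# positive or a negative observation, so enumerate every assignment.
as_extreme = sum(
    1
    for signs in product((0, 1), repeat=len(ranks))
    if sum(r for s, r in zip(signs, ranks) if s) <= statistic
)
p_value = min(1.0, 2 * as_extreme / 2 ** len(ranks))

print(neg_sum, pos_sum, statistic, round(p_value, 4))
```

Full enumeration is only practical for small samples like this one; for larger n, software typically switches to a normal approximation.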

P-values for signed rank test

For critical values and p-values, look at tables or computer-generated p-values.

This procedure is unavailable in the Student version of SPSS. It is available in SAS and the regular version of SPSS.

Comments on Signed Rank Test

• More “powerful” than the Sign Test, but requires more assumptions

• One-sided tests are possible

• Robust to outliers

• Some books/programs use the sum of the ranks of the positive values as the test statistic; the p-values are the same either way

• Nonparametric confidence intervals are also available from some software programs

• For tied observations, use average rank for each tied observation

Nonparametric statistics for small, non-normal samples

Paired Data

The same as for univariate data, except perform the test using the differences rather than the raw data.

Two Independent Groups

Mann-Whitney Rank Sum Test (Ch. 24)

• Procedure is similar to the Signed Rank test, except that instead of dividing observations according to whether they are positive or negative, we divide observations according to group membership.

• Assumptions include (1) independent, random samples, (2) independently selected groups, and (3) the shape and spread of the two distributions are the same

Paired Differences Example

Wife 0.4 0.5 1.0 0.2 0.9 1.0 1.2 0.1 0.6 0.4 0.2

Husband 0.5 0.4 0.7 0.0 0.6 1.2 0.7 0.1 0.5 0.1 0.1

Difference (Wife - Husband) -0.1 0.1 0.3 0.2 0.3 -0.2 0.5 0.0 0.1 0.3 0.1

Study Hypothesis: Men and women spend different amounts of time reading/watching the news.

The Signed Rank Test – 5 Steps

Assumptions:

• The measurement (difference) is continuous

• Independent, random sample from the population

• Distribution of difference is symmetric

Hypotheses: H0: Median of the difference is 0

HA: Median of difference is non-zero

Test Statistic: Minimum of the rank sums

P-value: from the computer!

For this example,

Computer Outputs - Paired

Data for wives and husbands are in two separate columns, with matched observations in the same row.

Analyze > Nonparametric Tests > 2 Related Samples…

Wilcoxon Signed Ranks Test

Ranks (HUSBAND - WIFE)

                   N     Mean Rank   Sum of Ranks
Negative Ranks     8a    5.88        47.00
Positive Ranks     2b    4.00        8.00

a. HUSBAND < WIFE

b. HUSBAND > WIFE

c. WIFE = HUSBAND

Test Statistics b (HUSBAND - WIFE)

Z                          -2.007a
Asymp. Sig. (2-tailed)

a. Based on positive ranks.

b. Wilcoxon Signed Ranks Test
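The rank sums and the Z value in this output can be reproduced by hand. The sketch below is an illustration only; it uses the usual large-sample normal approximation with a tie correction and no continuity correction, which is an assumption about what the software does, although it appears to match the Z reported above. Differences are entered as Wife - Husband, so the signs are the reverse of the HUSBAND - WIFE labels in the output.

```python
from collections import Counter
from math import erfc, sqrt

# Differences (Wife - Husband) from the paired example
diffs = [-0.1, 0.1, 0.3, 0.2, 0.3, -0.2, 0.5, 0.0, 0.1, 0.3, 0.1]
nonzero = [d for d in diffs if d != 0]   # the tied pair (difference 0) is dropped
n = len(nonzero)

# Average rank for each distinct absolute difference
abs_sorted = sorted(abs(d) for d in nonzero)
rank_of = {}
for v in set(abs_sorted):
    positions = [i + 1 for i, a in enumerate(abs_sorted) if a == v]
    rank_of[v] = sum(positions) / len(positions)

wife_more = sum(rank_of[abs(d)] for d in nonzero if d > 0)     # HUSBAND < WIFE
husband_more = sum(rank_of[abs(d)] for d in nonzero if d < 0)  # HUSBAND > WIFE

# Normal approximation with tie correction (assumed to mirror the software)
tie_sizes = Counter(abs_sorted).values()
mean = n * (n + 1) / 4
var = n * (n + 1) * (2 * n + 1) / 24 - sum(t**3 - t for t in tie_sizes) / 48
z = (min(wife_more, husband_more) - mean) / sqrt(var)
p_approx = erfc(abs(z) / sqrt(2))        # two-sided normal tail area

print(wife_more, husband_more, round(z, 3), p_approx)
```

With only 10 nonzero differences the normal approximation is rough, so an exact p-value is preferable when the software offers one.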

Computer Outputs - Paired

Data for wives and husbands are in two separate columns, with matched observations in the same row.

Analyze > Nonparametric Tests > 2 Related Samples…

Sign Test

Frequencies (HUSBAND - WIFE)

                          N
Negative Differences a
Positive Differences b
Ties c

a. HUSBAND < WIFE

b. HUSBAND > WIFE

c. WIFE = HUSBAND

Test Statistics b (HUSBAND - WIFE)

Exact Sig. (2-tailed)     .109a

a. Binomial distribution used.

b. Sign Test
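The exact significance in this output can be checked from the signs of the differences alone. A minimal sketch, with differences entered as Wife - Husband and the tied pair excluded:

```python
from math import comb

# Differences (Wife - Husband) from the paired example
diffs = [-0.1, 0.1, 0.3, 0.2, 0.3, -0.2, 0.5, 0.0, 0.1, 0.3, 0.1]
n_pos = sum(d > 0 for d in diffs)   # wife spends more time
n_neg = sum(d < 0 for d in diffs)   # husband spends more time
n = n_pos + n_neg                   # ties are left out of the test

# Two-sided exact p-value from Binomial(n, 1/2)
k = min(n_pos, n_neg)
p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n)

print(n_pos, n_neg, round(p_value, 3))
```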

Two Independent Groups Example

Wife 0.4 0.5 1.0 0.2 0.9 1.0 1.2 0.1 0.6 0.4 0.2

Husband 0.5 0.4 0.7 0.0 0.6 1.2 0.7 0.1 0.5 0.1 0.1

Study Hypothesis: Men and women spend different amounts of time reading/watching the news.

The Mann-Whitney Test – 5 Steps

Assumptions:

• Independent, random samples

• Independently selected groups

• The shape and spread of the two distributions are the same

Hypotheses: H0: Group medians are the same

HA: Group medians are different

Test Statistic: rank sums

P-value: from the table or computer!

For this example,

Computer Outputs - Independent

Data for wives and husbands are in the same column; a second column indicates whether each observation is for the wife or the husband*.

Analyze > Nonparametric Tests > 2 Independent Samples…

Mann-Whitney Test

Ranks (TIME)

GROUP      N     Mean Rank   Sum of Ranks
Wife       11    12.68       139.50
Husband    11    10.32       113.50

Test Statistics b (TIME)

Mann-Whitney U                     47.500
Wilcoxon W                         113.500
Asymp. Sig. (2-tailed)
Exact Sig. [2*(1-tailed Sig.)]

a. Not corrected for ties.

b. Grouping Variable: GROUP

*: Type of this variable must be Numeric in SPSS.
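The rank sums and the U statistic in this output can be reproduced by ranking all 22 observations together. The sketch below is an illustration only (the variable names are made up); it stops at the test statistics and does not attempt the p-value.

```python
# Hours spent reading/watching the news, from the two-group example
wife = [0.4, 0.5, 1.0, 0.2, 0.9, 1.0, 1.2, 0.1, 0.6, 0.4, 0.2]
husband = [0.5, 0.4, 0.7, 0.0, 0.6, 1.2, 0.7, 0.1, 0.5, 0.1, 0.1]

# Rank the pooled sample; tied values share their average rank
combined = sorted(wife + husband)
rank_of = {}
for v in set(combined):
    positions = [i + 1 for i, x in enumerate(combined) if x == v]
    rank_of[v] = sum(positions) / len(positions)

# Split the ranks by group membership instead of by sign
wife_sum = sum(rank_of[v] for v in wife)
husband_sum = sum(rank_of[v] for v in husband)

n1, n2 = len(wife), len(husband)
u_wife = wife_sum - n1 * (n1 + 1) / 2
u_husband = husband_sum - n2 * (n2 + 1) / 2
u = min(u_wife, u_husband)            # Mann-Whitney U
w = min(wife_sum, husband_sum)        # smaller rank sum, reported as Wilcoxon W

print(wife_sum, husband_sum, u, w)
print(round(wife_sum / n1, 2), round(husband_sum / n2, 2))  # mean ranks
```

As a check, the two U values always add up to n1 × n2 (here 121), and the two rank sums add up to N(N+1)/2 (here 253).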

Comments on Nonparametric Test for 2 Independent Samples

• Robust to outliers

• One-sided tests are possible

• Nonparametric confidence intervals are also available from some software programs

• For tied observations, use average rank for each tied observation

Possible Names

• Mann-Whitney Rank Sum Test

• Mann-Whitney Test

• Mann-Whitney U Test

• Wilcoxon Rank Sum Test

Testing for a Relationship between Categorical Variables

Large Sample Size

• Chi-square test

Small Sample Size

• Chi-square test with Yates’ continuity correction

• Fisher’s exact test

Urgent Colonoscopy for the Diagnosis and Treatment of Severe Diverticular Hemorrhage New England Journal of Medicine 2000;342:78-82

Research Hypothesis

Severe Bleeding   Medical and Surgical Treatment   Medical and Colonoscopic Treatment   Total
No                11                               10                                   21
Yes               6                                0                                    6
Total             17                               10                                   27

Fisher’s Exact Test – 5 Steps

Assumptions:

• Independent, random sample from the population

• Two variables are categorical

Hypotheses: H0: Response and Predictor are Independent

HA: Response and Predictor are Associated

Test Statistic: (p-value)

P-value: from the computer!

For this example, p=0.057
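For a 2x2 table the "computer" calculation is short enough to sketch directly. With the row and column totals held fixed, the count in one cell follows a hypergeometric distribution; the two-sided p-value below adds up the probabilities of all tables that are no more likely than the observed one. That is one common definition of the two-sided p-value and is an assumption here, although it appears to agree with the value reported for this example. The function name is made up for this illustration.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the table [[a, b], [c, d]].

    Returns (two_sided, one_sided) p-values with all margins held fixed.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # P(top-left cell = x), hypergeometric
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Two-sided: every table at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties)
    two_sided = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    # One-sided: the smaller of the two tail areas
    lower = sum(prob(x) for x in range(lo, a + 1))
    upper = sum(prob(x) for x in range(a, hi + 1))
    return two_sided, min(lower, upper)

# Rows: severe bleeding No / Yes; columns: surgical / colonoscopic treatment
two_sided, one_sided = fisher_exact_2x2(11, 10, 6, 0)
print(round(two_sided, 3), round(one_sided, 3))
```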

Data Entry

Weight the variable: count.

Data > Weight Cases…

Computer Outputs - FET

Perform FET (or Chi-square test if sample size is large)

Analyze > Descriptive Statistics > Crosstabs…

Assign “bleeding” for “Row(s)”, “treat” for “Column(s)”

Click “Statistics” to check “Chi-square”

Crosstabs

BLEEDING * TREAT Crosstabulation

BLEEDING   Medical and Surgical Treatment   Medical and Colonoscopic Treatment   Total
No         11                               10                                   21
Yes        6                                0                                    6
Total      17                               10                                   27

Chi-Square Tests

                               Value     df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             4.538b    1    .033
Continuity Correction a        2.726     1    .099
Likelihood Ratio               6.530     1    .011
Fisher's Exact Test                                                   .057                   .042
Linear-by-Linear Association   4.370     1    .037
N of Valid Cases

a. Computed only for a 2x2 table

b. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 2.22.
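The Pearson and continuity-corrected rows of this output can be reproduced from the 2x2 table. The sketch below is an illustration under the usual textbook formulas (expected count = row total × column total / N, Yates' correction subtracts 0.5 from each absolute deviation); the p-value for 1 degree of freedom is obtained from the normal tail via `math.erfc`. The function names are made up for this example.

```python
from math import erfc, sqrt

# Rows: severe bleeding No / Yes; columns: surgical / colonoscopic treatment
observed = [[11, 10], [6, 0]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence
expected = [[r * c / n for c in col_totals] for r in row_totals]
min_expected = min(min(row) for row in expected)

def chi_square(correction=0.0):
    """Chi-square statistic; correction=0.5 applies Yates' continuity correction."""
    return sum(
        max(abs(o - e) - correction, 0.0) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_df1(stat):
    """Upper-tail area of a chi-square distribution with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

pearson = chi_square()
yates = chi_square(correction=0.5)

print(round(pearson, 3), round(p_value_df1(pearson), 3))
print(round(yates, 3), round(p_value_df1(yates), 3))
print(round(min_expected, 2))
```

With two of the four expected counts below 5, the chi-square approximation is doubtful here, which is why the exact test is the one to report.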

The Inexact Use of Fisher’s Exact Test in Six Major Medical Journals

JAMA 1989;261:3430-3433

Table 1. Specification of Use of Fisher’s Exact Test by Journal

Journal                                            No. of Articles That Specified / No. of Articles Reviewed
------------------------------------------------------------------------------------------------------
New England Journal of Medicine                    8 / 9
Annals of Internal Medicine                        2 / 4
British Medical Journal                            3 / 6
The Journal of the American Medical Association    6 / 16
Lancet                                             4 / 14
American Journal of Medicine                       0 / 7

Homework

To be posted, not graded

Solutions will be posted on Monday

Read Chapters 24, 25, 27
