Mann-Kendall trend test · Resampling tests · 2015. 2. 19.


Transcript of: Mann-Kendall trend test · Resampling tests

Page 1:

W4-Lecture 1: Useful non-parametric tests for trends and distribution differences between two datasets

• Mann-Kendall trend test
• Resampling tests:
  – Permutation
  – Bootstrap, one-sample, two-sample

• Multiplicity and “field significance”

Page 2:

Non-parametric test for location: Wilcoxon-Mann-Whitney (rank-sum) test

• How do we compare the means or medians of two datasets if they are clearly non-Gaussian, with unknown distributions and a few wild values?

• We pool both datasets into one (dataAB) and compare the ranks of dataA with those of dataB within dataAB. Under the null hypothesis that dataA and dataB are drawn from the same statistical distribution, the sum of the dataA ranks should, with high probability, be close to its expected value under a random assignment of the dataAB ranks (Wilcoxon-Mann-Whitney or rank-sum test). For paired samples, the related Wilcoxon signed-rank test asks instead whether the ranked differences are symmetric about zero.

• For such non-Gaussian data these tests are more robust, and often more powerful, than one-sample or two-sample t-tests.
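The pooling-and-ranking step can be sketched in a few lines of Python (a minimal illustration, not from the lecture; `midranks`, `rank_sum`, and the toy data are my own names):

```python
def midranks(values):
    """Rank a list 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied run (1-based)
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum(data_a, data_b):
    """Sum of the pooled ranks belonging to data_a."""
    pooled = list(data_a) + list(data_b)
    ranks = midranks(pooled)
    return sum(ranks[: len(data_a)])

# Toy check: when all of A lies below all of B, A gets the smallest ranks
print(rank_sum([1, 2, 3], [4, 5, 6]))  # → 6.0 (ranks 1 + 2 + 3)
```

The mid-rank handling of ties matters later, when the pooled data contain repeated values.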

Page 3:

Wilcoxon-Mann-Whitney test for data with small sample sizes (<20)

When the sample sizes of dataA and dataB are small (<20), one can directly compare the U value of each dataset to the critical U value in a U table for the same sample sizes. A very good example is given at http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric4.html

The null hypothesis (Ho) is that there is no difference in distribution between the two datasets. If the U value (the smaller of U1 and U2) is less than or equal to the critical value, one can reject Ho, because that dataset has been systematically ranked lower than the other. If U exceeds the critical value, Ho cannot be rejected.
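The small-sample decision rule can be sketched as follows (illustrative only; `small_sample_decision` is my own name, and `u_crit` must come from a published U table for the given n1, n2, and significance level — the value below is a made-up placeholder):

```python
def small_sample_decision(u1, u2, u_crit):
    """Small-sample rule: reject Ho when U = min(U1, U2) <= tabulated U_crit."""
    u = min(u1, u2)
    return "reject Ho" if u <= u_crit else "cannot reject Ho"

# Hypothetical numbers: U = min(3, 22) = 3 against an assumed critical value of 2
print(small_sample_decision(3, 22, 2))  # → cannot reject Ho
```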

Page 4:

For datasets with larger samples (>20 for each dataset)

• Null hypothesis: dataA with n1 samples and dataB with n2 samples are drawn from the same statistical distribution. Under Ho, the observed set of ranks for dataA is equally likely as any other choice of n1 ranks drawn from the pooled dataAB with n = n1 + n2 total samples. There is therefore a total of n!/(n1! n2!) possible rank assignments. The distribution of the sums of these possible ranks approaches a Gaussian distribution when n1 and n2 > 10.
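The count n!/(n1! n2!) is just the binomial coefficient C(n, n1) — the number of ways to choose which n1 of the n pooled ranks go to dataA. A quick check (my own snippet), using the sample sizes n1 = 12, n2 = 11 that appear in the seeding example later in these notes:

```python
import math

# Number of ways to choose which n1 of the n = n1 + n2 pooled ranks go to dataA
n1, n2 = 12, 11
n_assignments = math.comb(n1 + n2, n1)  # = (n1 + n2)! / (n1! * n2!)
print(n_assignments)  # → 1352078
```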

• Based on the Mann-Whitney U statistic, the mean and variance of the distribution of the n!/(n1! n2!) possible rank sums can be computed. From the rank sums R1 and R2 of dataA and dataB:

U1 = R1 − n1(n1 + 1)/2,   U2 = R2 − n2(n2 + 1)/2

and, comparing to the data pooled from dataA and dataB:

μU = n1 n2 / 2,   σU = sqrt[ n1 n2 (n1 + n2 + 1) / 12 ]

if the data have only a few repeating values. If the data have a large number of repeating values (ties), with tj tied values in the j-th of J tie groups:

σU = sqrt[ n1 n2 (n1 + n2 + 1)/12 − n1 n2 Σj=1..J (tj³ − tj) / (12 (n1 + n2)(n1 + n2 − 1)) ]

If n1 or n2 is less than 20, compare U = min(U1, U2) with the critical value from a U table. Otherwise, based on the Gaussian distribution,

z = (U1 − μU) / σU

and the z value is used to determine whether the null hypothesis can be rejected.
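The large-sample (few-ties) formulas above translate directly into code (a sketch using only the standard library; `u_test_large_sample` is my own name, and the one-sided p value uses the Gaussian CDF via `math.erf`):

```python
import math

def u_test_large_sample(u1, n1, n2):
    """Normal approximation to the Mann-Whitney U null distribution (few ties)."""
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu_u) / sigma_u
    p_one_sided = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z <= z)
    return z, p_one_sided

# Numbers from the cloud-seeding example on a later slide: U1 = 30.5, n1 = 12, n2 = 11
z, p = u_test_large_sample(30.5, 12, 11)
print(round(z, 2), round(p, 3))  # → -2.18 0.014
```

Keeping σU unrounded gives z ≈ −2.18; the slide's −2.19 comes from rounding σU to 16.2 before dividing. The p value agrees either way.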

Page 5:

Example-1:

• Determine whether the values in dataA (1, 3, 20, 5, 11) are significantly different from those of dataB (2, 5, 6, 7, 15, 17).

• Null hypothesis: dataA is drawn from the same distribution as the merged dataAB.

• The U value for the dataset with the smaller rank sum (dataA) is greater than the critical U value (for 0.01 or 0.05 significance at n1 = 5, n2 = 6), so the null hypothesis cannot be rejected at the 0.05 or 0.01 level.

DataA   rank(AB)        dataB   rank(AB)
1       1               2       2
3       3               5       4.5
20      11              6       6
5       4.5             7       7
11      8               15      9
                        17      10

dataAB sorted: 1, 2, 3, 5, 5, 6, 7, 11, 15, 17, 20 (ranks 1–11; the tied 5's share the average rank 4.5)

n1 = 5, n2 = 6;  rank sums: R1 = 27.5, R2 = 38.5

U1 = R1 − n1(n1 + 1)/2 = 27.5 − 15 = 12.5
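As a check on the example, the pooled ranks and U1 can be recomputed directly from the stated data (my own snippet; tied values share the average rank):

```python
data_a = [1, 3, 20, 5, 11]
data_b = [2, 5, 6, 7, 15, 17]
pooled = sorted(data_a + data_b)

def rank(v):
    """Average (mid) rank of value v in the pooled, sorted sample (1-based)."""
    positions = [i + 1 for i, x in enumerate(pooled) if x == v]
    return sum(positions) / len(positions)

r1 = sum(rank(v) for v in data_a)              # rank sum of dataA
u1 = r1 - len(data_a) * (len(data_a) + 1) / 2  # U1 = R1 - n1(n1+1)/2
print(r1, u1)  # → 27.5 12.5
```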

Page 6:

• If we plot the U values for all the possible dataA subsets randomly drawn from dataAB, they follow a Gaussian distribution (of course, this assumption is only valid for moderate to large samples, i.e., n1, n2 > 10).

[Figure: Gaussian distribution of the possible rank-sum values (x-axis: rank-sum value; y-axis: probability).]

μU = n1 n2 / 2 = (5 × 6)/2 = 15

σU = sqrt[ n1 n2 (n1 + n2 + 1)/12 ] = sqrt[ 5 × 6 × (5 + 6 + 1)/12 ] ≈ 5.5

z_dataA = (U1 − μU)/σU = (12.5 − 15)/5.5 ≈ −0.46

The probability of a Z value this large or larger is about 68%, so the null hypothesis cannot be rejected.

Page 7:

Example:  

Evaluate whether cloud seeding can alter lightning strikes.

Page 8:

• Null hypothesis: no effect. With n1 = 12 and n2 = 11:

U1 = R1 − n1(n1 + 1)/2 = 108.5 − 12(12 + 1)/2 = 108.5 − 78 = 30.5

Comparing to the data pooled from dataA and dataB:

μU = n1 n2 / 2 = (12)(11)/2 = 66

σU = sqrt[ n1 n2 (n1 + n2 + 1)/12 ] = sqrt[ (12)(11)(12 + 11 + 1)/12 ] ≈ 16.2

Based on the Gaussian distribution:

z = (U1 − μU)/σU = (30.5 − 66)/16.2 = −2.19

p value: 0.014; ~1.4% of the 1,352,078 possible values of U1 under the null hypothesis are smaller than the observed U1. The null hypothesis can be rejected.

Page 9:

The bootstrap: Why?

• When we only have one sample of data, X with values xi, i = 1, …, n, we need to determine its statistical distribution to assess uncertainty.

How do we construct the probability distribution of this data sample?

• We write each of the n data values on a paper slip, put all n slips into a hat, then randomly draw one slip and record its value as x*1.

• We put all the slips back into the hat, mix them, then draw a 2nd slip and record its value as x*2.

• We repeat this process n times to generate a new dataset, X*1 = x*1,1, x*1,2, …, x*1,n. NOTICE that x*1,1 and x*1,2 can be the same value, because a data value can be drawn from the hat more than once.

• We can repeat the above process to generate a second new dataset, X*2 = x*2,1, x*2,2, …, x*2,n.

• We can repeat this process by computer many times, say nB = 10,000 or more, to generate nB new datasets X*, each with the same n number of values as the original data X.

• Then the statistic of interest, say the mean, is computed for each of the nB generated bootstrap samples X*j, where j = 1, 2, …, nB. The resulting frequency distribution is then used to approximate the true sampling distribution.
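The hat-and-slips procedure above is exactly sampling with replacement; a minimal sketch (function and variable names are mine, not from the lecture):

```python
import random
import statistics

def bootstrap_means(x, n_boot=10000, seed=0):
    """Generate n_boot bootstrap resamples of x and return their means."""
    rng = random.Random(seed)
    n = len(x)
    # rng.choices draws n values WITH replacement, like slips returned to the hat
    return [statistics.mean(rng.choices(x, k=n)) for _ in range(n_boot)]

x = [1, 2, 3, 4, 5, 2]               # the example data from the next slide
means = sorted(bootstrap_means(x))
lo = means[int(0.05 * len(means))]   # 5th percentile
hi = means[int(0.95 * len(means))]   # 95th percentile
print(lo, hi)                        # approximate 90% uncertainty range for the mean
```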

 

[Histogram of the bootstrap statistic; the 5% and 95% percentiles of the distribution are marked.]

Page 10:

Example:  

We'd like to calculate the mean of a dataset X = 1, 2, 3, 4, 5, 2 with n = 6 values. We do not know its PDF, but need to estimate the uncertainty of the mean.

Using the bootstrap approach, we can generate a set of new samples X*j, where j = 1, 2, …, 20, as shown below.

[Histogram of the 20 bootstrap means (x-axis: mean value, 1.5–4.0; y-axis: relative frequency, 0–0.2).]

                     values               mean
original data, X     1  2  3  4  5  2     2.83
X*1                  2  5  1  3  2  2     2.50
X*2                  1  5  3  3  1  5     3.00
X*3                  1  4  1  3  1  4     2.33
X*4                  1  3  1  2  1  4     2.00
X*5                  1  3  1  2  5  4     2.67
X*6                  1  3  1  2  3  4     2.33
X*7                  5  1  4  5  2  4     3.50
X*8                  3  4  2  1  1  2     2.17
X*9                  1  3  2  4  5  1     2.67
X*10                 3  1  2  1  4  2     2.17
X*11                 2  5  1  3  2  2     2.50
X*12                 1  5  3  3  1  5     3.00
X*13                 4  3  1  5  4  3     3.33
X*14                 5  2  4  2  1  2     2.67
X*15                 1  3  5  3  2  4     3.00
X*16                 2  4  1  2  3  4     2.67
X*17                 5  1  2  4  2  1     2.50
X*18                 3  4  2  1  5  2     2.83
X*19                 1  3  2  4  5  2     2.83
X*20                 3  1  4  1  4  2     2.50

Sorted bootstrap means: 2.00, 2.17, 2.17, 2.33, 2.33, 2.50, 2.50, 2.50, 2.50, 2.67, 2.67, 2.67, 2.67, 2.83, 2.83, 3.00, 3.00, 3.00, 3.33, 3.50


The mean of the original data is 2.83, and the 90% confidence range of the mean is [2.0, 3.5].
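With only nB = 20 resamples, the 5% and 95% percentiles of the sorted means fall essentially at the extremes, which is where [2.0, 3.5] comes from. A quick check on the 20 tabulated means (list transcribed from the table above):

```python
# The 20 bootstrap-sample means from the example (X*1 ... X*20)
means = [2.50, 3.00, 2.33, 2.00, 2.67, 2.33, 3.50, 2.17, 2.67, 2.17,
         2.50, 3.00, 3.33, 2.67, 3.00, 2.67, 2.50, 2.83, 2.83, 2.50]
means.sort()
print(means[0], means[-1])  # → 2.0 3.5 -- the quoted 90% range
```

In practice one would use nB = 10,000 or more, so the percentiles are interior points of the distribution rather than its extremes.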

Page 11:

Summary  

• Wilcoxon-Mann-Whitney (rank-sum) test: an effective test for comparing two datasets with unknown distributions, either by comparing the U value computed from their rank sums to a tabulated critical U value (if the sample sizes are <20), or by computing the z value of the U statistic of one dataset against the distribution of rank sums in the data pooled from the two datasets (for larger samples).

• Bootstrap approach: assume the data we have are a random draw from the population represented by the data. One can randomly resample, with replacement, from this pool many times (say 1,000 or 10,000) to determine the PDF of the sample statistic (e.g., the mean) and its uncertainty range.