Lecture 22 Goodies - ULisboa
Lecture 22 Goodies*
* Goodies related to animals, plants and numbers…
https://kiirstio.wixsite.com/kowen/post/the-25-days-of-christmas-an-r-advent-calendar
https://www.nature.com/articles/d41586-019-03595-0
https://hackernoon.com/machine-learning-basics-its-your-cup-of-tea-af4baf060ace
https://blog.revolutionanalytics.com/2013/12/k-means-clustering-86-single-malt-scotch-whiskies.html
Numerical Ecology - Theoretical Lecture 22 – 02-12-2019
https://www.azquotes.com/quote/298634
The attribution is itself a lie: Mark Twain credited the remark to Disraeli (UK prime minister), but there is no written record of Disraeli making the statement.
Abundance data for fish species at 20 sampling stations in the Sado estuary.
[Map: the 20 numbered sampling stations in the Sado estuary, with Setúbal and the Atlantic Ocean labelled, a north arrow and a 5 km scale bar; axes span 8º55' to 8º45' and 38º26' to 38º30'.]
Example: clustering
Results of a classification analysis
[Figure: dendrogram of the 20 stations. Complete linkage, Euclidean distances; axis: linkage distance (0 to 30).]
[Figure: dendrogram of the 20 stations. Single linkage, Euclidean distances; axis: linkage distance (0 to 20).]
[Figure: dendrogram of the 20 stations. Complete linkage, Euclidean distances; axis: linkage distance (0 to 30).]
[Figure: dendrogram of the 20 stations. Weighted pair-group average, Euclidean distances; axis: linkage distance (0 to 30).]
[Figure: dendrogram of the 20 stations. Weighted pair-group centroid (median), Euclidean distances; axis: order of amalgamation, 0 to 20 (distances are non-monotonic).]
[Figure: dendrogram of the 20 stations. Ward's method, Euclidean distances; axis: linkage distance (0 to 70).]
[Figure: dendrogram of the 20 stations. Ward's method, city-block (Manhattan) distances; axis: linkage distance (0 to 120).]
[Figure: dendrogram of the 20 stations. Ward's method, 1-Pearson r; axis: linkage distance (0.0 to 2.5).]
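The dendrograms above differ only in the linkage rule and in the distance measure. A minimal sketch of how such a comparison could be set up in R, assuming simulated Poisson counts in place of the Sado data (which are not included here). The mapping of the slide titles to `hclust()` method names (`"mcquitty"` for weighted pair-group average, `"median"` for weighted pair-group centroid, `"ward.D2"` for Ward's method) is an assumption about what the original software computed.

```r
# Simulated stand-in for a 20 stations x 8 species abundance table (assumption)
set.seed(1)
abund <- matrix(rpois(20 * 8, lambda = 8), nrow = 20,
                dimnames = list(as.character(1:20), paste0("sp", 1:8)))

# Three ways of measuring dissimilarity between stations
d_euc <- dist(abund, method = "euclidean")
d_man <- dist(abund, method = "manhattan")   # city-block
d_cor <- as.dist(1 - cor(t(abund)))          # 1 - Pearson r between stations

# Linkage rules named on the slides, written as hclust() method names
methods <- c(single = "single", complete = "complete",
             wpgma = "mcquitty", wpgmc = "median", ward = "ward.D2")
trees <- lapply(methods, function(m) hclust(d_euc, method = m))

# Ward's method on the other two dissimilarities
ward_man <- hclust(d_man, method = "ward.D2")
ward_cor <- hclust(d_cor, method = "ward.D2")

# The linkage-distance scale depends strongly on the method
heights <- sapply(trees, function(h) max(h$height))
round(heights, 2)
```

Calling `plot(trees$complete)` (or any other element) draws the corresponding dendrogram.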
Number of groups and interpretation of dendrograms
• There is no rule for selecting the number of groups to consider;
• We should aim for groups that are well differentiated;
• Interpretation relies on the attributes of the elements that make up each group, in either an exploratory or a confirmatory approach;
• Use of descriptive statistics.
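Since there is no rule for the number of groups, one way to explore is to cut the same tree at a few candidate values and describe each group with simple statistics. A minimal sketch, assuming simulated data with two made-up species columns (`sp1`, `sp2`) rather than the lecture's data:

```r
set.seed(42)
# Hypothetical stations: 10 with low sp1 counts, 10 with high sp1 counts
sites <- data.frame(sp1 = c(rpois(10, 3), rpois(10, 30)),
                    sp2 = rpois(20, 8))
hc <- hclust(dist(sites), method = "complete")

# Cut the same tree into 2, 3 and 4 groups and compare group sizes
cuts <- cutree(hc, k = 2:4)
apply(cuts, 2, table)

# Interpret one chosen solution with descriptive statistics per group
groups <- cutree(hc, k = 2)
group_means <- aggregate(sites, by = list(group = groups), FUN = mean)
group_means
```

Groups whose means differ clearly on the original variables are easier to defend than groups that only exist in the dendrogram.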
Number of groups
[Figure: dendrogram of the 20 stations. Complete linkage, Euclidean distances; axis: linkage distance (0 to 30).]
Clustering steps
1. Look at the data (and really see them!)
2. Transform the data?
3. Choose the distance (distance vs. association)
4. Choose the clustering method
5. Validate and interpret the results
Binary variables (presence/absence)
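For presence/absence data, step 3 usually means choosing a coefficient designed for binary variables. A minimal sketch, assuming a made-up table of 4 sites by 5 species. In base R, `dist(..., method = "binary")` gives the proportion of mismatches among the species present in at least one of the two sites (one minus the Jaccard similarity), so joint absences are ignored.

```r
# Hypothetical presence (1) / absence (0) table: rows = sites, columns = species
pa <- matrix(c(1, 1, 0, 0, 0,
               1, 1, 1, 0, 0,
               0, 0, 1, 1, 0,
               0, 0, 0, 1, 1),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("site", 1:4), paste0("sp", 1:5)))

# Asymmetric binary (Jaccard) distance: joint absences do not count as agreement
d_bin <- dist(pa, method = "binary")
d_bin

# Cluster the sites with average linkage (UPGMA)
hc_bin <- hclust(d_bin, method = "average")
hc_bin$merge   # first row shows the first pair of sites to be joined
```

Here site1 and site2 share two of their three species, so their distance is 1/3 and they are the first pair to merge.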
UPGMA
hclust(distancias, method = "average")
Perhaps the most famous clustering method…
Validating the groups
• An important step, although often neglected;
• The main approach is to compute a measure of agreement between the final result (the dendrogram) and the initial similarity/dissimilarity matrix;
• Cophenetic correlation coefficient.
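The agreement measure described above can be sketched in a few lines of R. This assumes simulated data in place of a real sites-by-variables table, and uses the Pearson correlation between the initial distances and the cophenetic distances returned by `cophenetic()`:

```r
set.seed(7)
# Simulated stand-in for a sites x variables table (assumption)
x <- matrix(rnorm(20 * 4), nrow = 20)
d <- dist(x)   # initial dissimilarity matrix

# Cophenetic correlation for several linkage rules
methods <- c("single", "complete", "average")
ccc <- sapply(methods, function(m) {
  hc <- hclust(d, method = m)
  cor(d, cophenetic(hc))   # correlation between original and cophenetic distances
})
round(ccc, 3)
```

A higher value means the dendrogram distorts the initial distances less; it does not by itself say how many groups to keep.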
Validating the groups
Initial matrix → similarity/dissimilarity matrix → dendrogram → cophenetic similarity/dissimilarity matrix
Cophenetic correlation coefficient: compares the similarity/dissimilarity matrix with the cophenetic similarity/dissimilarity matrix.
habitats <- read.csv("DataTP9habitats123.csv", sep = ";")
library(cluster)
# the first column holds the site labels, so it is dropped before computing distances
teste <- hclust(dist(habitats[, -1], method = "manhattan"), method = "average")
par(mfrow = c(1, 1))
plot(teste)
The first column holds the site labels.
https://stats.stackexchange.com/questions/149852/validate-dendrogram-in-cluster-analysis-what-is-the-meaning-of-cophenetic-corre
Cophenetic correlation coefficient
[Figure: matrix of original distances alongside the matrix of cophenetic distances.]
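A tiny worked case makes the two matrices concrete. This sketch assumes three made-up points on a line (`a`, `b`, `c`) and average linkage; the cophenetic distance of a pair is the height at which the two objects are first joined in the tree.

```r
# Three points on a line: pairwise distances are 1 (a-b), 5 (a-c), 4 (b-c)
pts <- c(a = 0, b = 1, c = 5)
d <- dist(pts)

# Average linkage: a and b join at height 1, then c joins at (5 + 4) / 2 = 4.5
hc <- hclust(d, method = "average")
coph <- cophenetic(hc)

as.vector(d)      # 1.0 5.0 4.0  (original distances)
as.vector(coph)   # 1.0 4.5 4.5  (cophenetic distances)
ccc <- cor(d, coph)
ccc               # about 0.97
```

The tree cannot represent both 5 and 4 for the pairs involving `c`, so it replaces them by their average; the correlation measures how much is lost in that step.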
Transforming the data can be fundamental!
set.seed(12345)
abund <- matrix(rpois(75, lambda = 8), ncol = 15, nrow = 5)
# make 1 species really abundant
abund[, 1] <- c(1000, 200, 200, 10, 10)
# and one with the inverse pattern, but less abundant
abund[, 2] <- c(10, 10, 200, 200, 1000) / 10
# get distance matrix
da <- dist(abund)
hcdaC <- hclust(da, method = "complete")
# get distance matrix over scaled data
sda <- dist(scale(abund))
hcsdaC <- hclust(sda, method = "complete")
# draw both dendrograms side by side
par(mfrow = c(1, 2))
plot(hcdaC, main = "unscaled")
plot(hcsdaC, main = "scaled")
[Figure: two dendrograms, one from the unscaled data and one from the scaled data.]
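To see why the unscaled and scaled dendrograms can differ, it helps to look at how much each species contributes to a single Euclidean distance. A minimal sketch that rebuilds the lecture's simulated `abund` matrix; the helper `share()` is made up for this illustration.

```r
set.seed(12345)
abund <- matrix(rpois(75, lambda = 8), ncol = 15, nrow = 5)
abund[, 1] <- c(1000, 200, 200, 10, 10)        # very abundant species
abund[, 2] <- c(10, 10, 200, 200, 1000) / 10   # inverse pattern, less abundant

# Fraction of the squared Euclidean distance between sites i and j
# that comes from each species (column)
share <- function(m, i, j) {
  sq <- (m[i, ] - m[j, ])^2
  sq / sum(sq)
}

raw_share    <- share(abund, 1, 5)          # raw counts
scaled_share <- share(scale(abund), 1, 5)   # each species centred and scaled

round(raw_share[1:2], 3)      # species 1 dominates the raw distance
round(scaled_share[1:2], 3)   # after scaling, species 1 and 2 weigh about the same
```

With raw counts the distance between sites 1 and 5 is almost entirely the difference in species 1, so the clustering mostly reflects that one species; scaling gives every species a comparable say.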