KmL & KML3D: K-Means for Longitudinal Data


Page 1: KmL  & KML3D :  K- Means FOr  Longitudinal Data

KML & KML3D: K-MEANS FOR LONGITUDINAL DATA

Christophe Genolini

Bernard Desgraupes

Bruno Falissard

Page 2

DEFINITION

Page 3

TWO TRAJECTORIES

Page 4

TEN TRAJECTORIES

Page 5

TOO MANY TRAJECTORIES...

Page 6

SOLUTION: CLUSTERS

Page 7

CLUSTER EXAMPLE

Page 8

HOW TO CLUSTER?

Parametric algorithms

Non-parametric algorithms

Page 9

HOW TO CLUSTER?

Parametric algorithms

Example: proc traj

Based on likelihood

Non-parametric algorithms

K-means (KmL)

Page 10

I ♥ Quebec…

Page 11

LIKELIHOOD FOR SIZE

Size = 1.84

Small likelihood

Big likelihood

Page 12

BIG LIKELIHOOD?

Page 13

PARAMETRIC ALGORITHMS

Number of clusters

Trajectory shape (linear, polynomial, …)

Distribution of the variable (Poisson, normal, …)

Maximization of the likelihood
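To make the notion of likelihood concrete, here is a small sketch in base R. It assumes, purely for illustration, that the observed size is compared against two candidate normal distributions; the group names, means and standard deviations are hypothetical and do not come from the slides.

```r
# Observed size, as on the "LIKELIHOOD FOR SIZE" slide.
size <- 1.84

# Two hypothetical candidate distributions (assumed parameters).
lik_group_a <- dnorm(size, mean = 1.65, sd = 0.07)  # centered far from 1.84
lik_group_b <- dnorm(size, mean = 1.80, sd = 0.07)  # centered close to 1.84

# The candidate with the larger density at the observed value
# is the one under which the observation is more likely.
liks <- c(group_a = lik_group_a, group_b = lik_group_b)
liks
names(which.max(liks))
```

A parametric algorithm would go one step further and tune the parameters themselves so that the likelihood of all observations is as large as possible.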

Page 14

NON-PARAMETRIC ALGORITHMS

Number of clusters

Maximization of some criterion

Page 15

K-MEANS

KML

Page 16

K MEANS LONGITUDINAL

Page 17

K MEANS LONGITUDINAL

∆      +
3.4    4.2
1.7    2.3
0.65   1.2
3.1    2.3
3.9    3.2

Page 18

K MEANS LONGITUDINAL

∆      +
1.6    6.8
0.36   5.1
1.3    4
4.9    0.6
5.7    0.6

Page 19

K MEANS LONGITUDINAL
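The preceding K MEANS LONGITUDINAL slides list pairs of numbers under the ∆ and + symbols. Assuming these are distances from each trajectory to two cluster centers, the loop they illustrate can be sketched in base R as follows. The toy data and the choice of starting centers are hypothetical, and this is not the KmL implementation itself.

```r
# Toy data: 6 trajectories (rows) measured at 4 time points (columns).
traj <- rbind(
  c(1, 2, 3, 4),
  c(1, 2, 2, 5),
  c(2, 3, 4, 5),
  c(6, 5, 4, 3),
  c(7, 5, 3, 2),
  c(6, 6, 4, 2)
)

# Euclidean distance between one trajectory and one center.
traj_dist <- function(x, center) sqrt(sum((x - center)^2))

# Start from two trajectories taken as initial centers (an arbitrary choice).
centers <- traj[c(1, 4), , drop = FALSE]

repeat {
  # Assignment step: each trajectory joins its nearest center.
  d <- sapply(seq_len(nrow(centers)), function(k)
    apply(traj, 1, traj_dist, center = centers[k, ]))
  clusters <- max.col(-d, ties.method = "first")

  # Update step: each center becomes the mean trajectory of its cluster.
  new_centers <- t(sapply(seq_len(nrow(centers)), function(k)
    colMeans(traj[clusters == k, , drop = FALSE])))

  # Stop once the centers no longer move.
  if (all(abs(new_centers - centers) < 1e-12)) break
  centers <- new_centers
}

clusters
centers
```

The sketch does not handle a cluster that becomes empty; with real data that case needs an explicit rule.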

Page 20

EXAMPLE

> kml(cld3,4,1,print.traj=TRUE)

Page 21

STRENGTH: MISSING VALUES
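One common way to keep a distance-based method usable with missing values is to compute the distance on the time points observed in both trajectories and rescale it. Base R's dist() does this for the Euclidean distance. The sketch below only illustrates that idea; the slides do not say whether KmL uses exactly this adjustment, and the two trajectories are made up.

```r
# Two trajectories over 4 time points; the first has a missing value.
x <- c(1, 2, NA, 4)
y <- c(1, 3, 5, 6)

# stats::dist() skips the time points where a value is missing and scales
# the sum of squares up in proportion to the number of time points used.
d_builtin <- as.numeric(dist(rbind(x, y)))

# The same computation written out by hand.
ok <- !is.na(x) & !is.na(y)
d_manual <- sqrt(length(x) / sum(ok) * sum((x[ok] - y[ok])^2))

c(builtin = d_builtin, manual = d_manual)
```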

Page 22

WEAKNESS: LOCAL MAXIMUM

Page 23

SOLUTION: RE-RUNNING
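Re-running means starting k-means several times from different initial conditions and keeping the best result. A minimal base R sketch, assuming simulated toy trajectories and using the total within-cluster sum of squares as the (assumed) measure of quality:

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Toy data: 30 trajectories at 5 time points, drawn around three mean levels.
traj <- rbind(
  matrix(rnorm(50, mean = 0), ncol = 5),
  matrix(rnorm(50, mean = 3), ncol = 5),
  matrix(rnorm(50, mean = 6), ncol = 5)
)

# Run k-means 20 times from different random starts and record each result.
runs <- lapply(1:20, function(i) kmeans(traj, centers = 3))
within_ss <- sapply(runs, function(r) r$tot.withinss)

# Keep the run with the smallest total within-cluster sum of squares.
best <- runs[[which.min(within_ss)]]
range(within_ss)
table(best$cluster)
```

In base R the same effect is available through the nstart argument of kmeans(), which keeps the best of several random starts.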

Page 24

PROBLEM: NUMBER OF CLUSTERS

Page 25

EXAMPLE

longData <- as.cld(gald())

kml(longData,2:5,10,print.traj=TRUE)

choice(longData)
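The call above tries 2 to 5 clusters with 10 re-runs each. To show what choosing the number of clusters with a quality criterion can look like, here is a base R sketch that computes the Calinski and Harabasz criterion for each candidate number of clusters. The data are simulated, the helper calinski_harabasz is a hypothetical name, and this is not how the kml package computes or displays its criteria.

```r
set.seed(2)  # arbitrary seed, only to make the sketch reproducible

# Toy data: 30 trajectories at 5 time points, drawn around three mean levels.
traj <- rbind(
  matrix(rnorm(50, mean = 0), ncol = 5),
  matrix(rnorm(50, mean = 3), ncol = 5),
  matrix(rnorm(50, mean = 6), ncol = 5)
)

# Calinski & Harabasz: between-cluster variance over within-cluster variance,
# each divided by its degrees of freedom. Higher is better.
calinski_harabasz <- function(fit, n) {
  k <- length(fit$size)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

# Try 2 to 5 clusters, 10 random starts each.
ks <- 2:5
ch <- sapply(ks, function(k) {
  fit <- kmeans(traj, centers = k, nstart = 10)
  calinski_harabasz(fit, n = nrow(traj))
})
names(ch) <- ks

ch
ks[which.max(ch)]  # number of clusters with the highest criterion
```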

Page 26

KML3D

Page 27

JOINT TRAJECTORIES

Page 28

JOINT TRAJECTORIES

Page 29

SOLUTION: CLUSTER

C1: partition for V1

C2: partition for V2

C1xC2: partition for joint trajectories?

C1 = {small, medium, big}

C2 = {blue, red}

C1xC2 = {small blue, small red, medium blue, medium red, big blue, big red}
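Crossing two partitions is easy to reproduce in base R with interaction(). The sketch below uses five made-up individuals; only the level names come from the slide. It shows how the crossed partition is built, not whether it is a good partition, which is the question the slide leaves open.

```r
# Hypothetical cluster memberships of five individuals.
c1 <- factor(c("small", "medium", "big", "small", "big"),
             levels = c("small", "medium", "big"))   # partition for V1
c2 <- factor(c("blue", "red", "red", "red", "blue"),
             levels = c("blue", "red"))              # partition for V2

# Crossed partition: one level per (C1, C2) combination,
# ordered with the C1 level varying slowest.
joint <- interaction(c1, c2, sep = " ", lex.order = TRUE)

levels(joint)
table(joint)  # some combinations may be empty
```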

Page 30

PROBLEM

Page 31

PROBLEM

Page 32

PROBLEM

Page 33

PROBLEM

Page 34

PROBLEM

Page 35

PROBLEM

Page 36

PROBLEM

Page 37

SOLUTION: THIRD DIMENSION

Page 38

SOLUTION: THIRD DIMENSION

# Two variable-trajectories plotted side by side
par(mfrow=c(1,2))
a <- c(1,2,1,3,2,3,3,4,5,3,5)
b <- c(6,6,6,5,6,6,5,5,4,3,3)
plot(a,type="l",ylim=c(0,10),xlab="First variable",ylab="")
plot(b,type="l",ylim=c(0,10),xlab="Second variable",ylab="")

# The same values as points in 3D: time, first variable, second variable
library(rgl)  # the 3D functions below come from the rgl package
points3d(1:11,a,b)
axes3d(c("x", "y", "z"))
title3d(, , "Time","First variable","Second variable")
box3d()
aspect3d(c(2, 1, 1))
rgl.viewpoint(0, -90, zoom = 1.2)

Page 39

CLUSTER IN 3D

cl <- gald(
  functionClusters=list(function(t){c(-4,-4)},function(t){c(5,0)},function(t){c(0,5)}),
  functionNoise = function(t){c(rnorm(1,0,2),rnorm(1,0,2))}
)
plot3d(cl)

kml(cl,3,1,paramKml=parKml(startingCond="randomAll"))
plot3d(cl,paramTraj=parTraj(col="clusters"))
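The idea of clustering joint trajectories directly, rather than crossing two separate partitions, can be imitated in base R by stacking the values of both variables into one row per individual and running k-means on that matrix. The sketch below reuses the three cluster means and the noise standard deviation from the gald call above. The helper make_group, the group sizes and the number of time points are hypothetical, and this is not the KML3D implementation.

```r
set.seed(3)  # arbitrary seed, only to make the sketch reproducible
n_times <- 10

# One group of n individuals: n_times values of V1 followed by n_times of V2,
# each drawn around a constant mean with standard deviation 2.
make_group <- function(n, mean_v1, mean_v2) {
  cbind(
    matrix(rnorm(n * n_times, mean_v1, 2), nrow = n),  # V1 over time
    matrix(rnorm(n * n_times, mean_v2, 2), nrow = n)   # V2 over time
  )
}

# Three groups centered on (-4, -4), (5, 0) and (0, 5).
joint <- rbind(
  make_group(10, -4, -4),
  make_group(10, 5, 0),
  make_group(10, 0, 5)
)

# The Euclidean distance between two rows now accounts for both variables
# at every time point, so k-means clusters the joint trajectories.
fit <- kmeans(joint, centers = 3, nstart = 10)

# Compare the clusters found with the groups used in the simulation.
table(found = fit$cluster, simulated = rep(1:3, each = 10))
```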

Page 40

PERSPECTIVES

Page 41

AWARD: BEST “NUMBER OF CLUSTERS” FINDER…

The nominees are:

Calinski & Harabasz

Ray & Turi

Davies & Bouldin

...

The winner is…

Page 42

AWARD: BEST “NUMBER OF CLUSTERS” FINDER…

The nominees are:

Calinski & Harabasz

Ray & Turi

Davies & Bouldin

...

The winner is…

Falissard & Genolini (or G & F?)

Page 43

PERSPECTIVE: SHAPE DISTANCE

Page 44

PERSPECTIVE: CLUSTER ACCORDING TO SHAPE

“Classic” distance

“Shape” distance

Page 45

IMPUTATION

Page 46

IMPUTATION

Page 47

IMPUTATION

Page 48

IMPUTATION

Page 49

THANK YOU!