Andrew Smith

Andrew Smith Describing childhood diet with cluster analysis 6th September 2012


Page 1: Andrew Smith

Andrew Smith

Describing childhood diet with cluster analysis

6th September 2012

Page 2: Andrew Smith

Describing diet with cluster analysis

• Kate Northstone

• Pauline Emmett

• PK Newby

• World Cancer Research Fund

• MRC, Wellcome Trust, University of Bristol


Page 3: Andrew Smith

Describing diet with cluster analysis

Page 4: Andrew Smith

Outline

• Introductions

• ALSPAC

• Food frequency questionnaires / diet diaries

• Dietary patterns

• Cluster analysis

• k-means cluster analysis

• Results

• 4 cluster solution

• Associations with socio-demographic variables

Page 5: Andrew Smith

ALSPAC

• Avon Longitudinal Study of Parents and Children

• Birth cohort study

• 14,541 pregnant women and their children

• www.bris.ac.uk/alspac


Page 6: Andrew Smith

Food frequency questionnaires

Page 7: Andrew Smith

Diet diaries

• Records all food and drink consumed over a 3-day period

• 2 weekdays and 1 weekend day

• Parent completes at age 7

• Child completes at ages 10 and 13

Page 8: Andrew Smith

Dietary patterns

• Examine diet as a whole

• Start with many variables (food group intakes)

• Express as a small number of variables

Image: Paul / FreeDigitalPhotos.net

Page 9: Andrew Smith

Principal components analysis (PCA)

• Examine diet as a whole

• Start with many variables

• Use correlations between foods

• Express as a small number of components

Image: Paul / FreeDigitalPhotos.net

Page 10: Andrew Smith

Cluster analysis

• Examine diet as a whole

• Start with many variables

• Use similarities between people

• Express as a small number of clusters

Image: Paul / FreeDigitalPhotos.net

Page 11: Andrew Smith

Cluster analysis

• Separate subjects into non-overlapping groups

• Based on ‘distances’ between individuals

• Unsupervised learning

Image: Boaz Yiftach / FreeDigitalPhotos.net
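As a rough illustration of a ‘distance’ between individuals, the sketch below compares three children using Euclidean distance. The slide does not say which distance was used, so Euclidean distance is an assumption here, and the food groups, child names and intake values are all hypothetical.

```python
import math

# Hypothetical standardised intakes for three children across four food
# groups, in the order: vegetables, fruit, crisps, processed meat.
# These numbers are illustrative only and are not ALSPAC data.
children = {
    "child_a": [1.2, 0.9, -0.8, -1.0],
    "child_b": [1.0, 1.1, -0.6, -0.9],
    "child_c": [-0.9, -1.2, 1.3, 1.1],
}

def euclidean(x, y):
    """Straight-line distance between two intake vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

d_ab = euclidean(children["child_a"], children["child_b"])
d_ac = euclidean(children["child_a"], children["child_c"])

# Children a and b eat similarly, so they are much closer to each other
# than a is to c and would tend to end up in the same cluster.
print(f"a-b: {d_ab:.2f}, a-c: {d_ac:.2f}")
```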

Page 12: Andrew Smith

k-means cluster analysis

• Most widely used for dietary patterns

• Number of clusters, k, is specified beforehand

• Minimises

– Distance from each subject to his/her cluster mean

– Summed over all subjects in that cluster

– Summed over all clusters
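Written out, the quantity being minimised can be sketched as below. This assumes squared Euclidean distance, which is the usual k-means choice; the slide says only ‘distance’. The symbols are introduced here for illustration: x_i is the vector of food group intakes for subject i, C_j is the set of subjects in cluster j, and x̄_j is the mean of cluster j.

```latex
\[
  W(C_1, \dots, C_k)
    = \sum_{j=1}^{k} \sum_{i \in C_j} \lVert x_i - \bar{x}_j \rVert^2 ,
  \qquad
  \bar{x}_j = \frac{1}{|C_j|} \sum_{i \in C_j} x_i
\]
```

The inner sum runs over all subjects in one cluster and the outer sum runs over all k clusters, matching the three steps listed above.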

Page 13: Andrew Smith

k-means cluster analysis

Page 14: Andrew Smith

Problems with the standard algorithm

The algorithm for k-means cluster analysis is:

• Short-sighted

• Tends to find solutions that are at a local minimum

– So run the algorithm 100 times and choose the solution that is the minimum out of all the minima
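The restart strategy can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the study: it assumes squared Euclidean distance, random subjects as starting means, and a tiny made-up data set; the function names are hypothetical.

```python
import random

def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def kmeans_once(data, k, rng, max_iter=100):
    """One run of the standard algorithm from a random start."""
    centres = rng.sample(data, k)  # k distinct subjects as starting means
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(x, centres[j]))
                      for x in data]
        if new_labels == labels:  # assignments stable: a (local) minimum
            break
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sq_dist(x, centres[lab]) for x, lab in zip(data, labels))
    return wss, labels, centres

def kmeans_best_of(data, k, n_starts=100, seed=0):
    """Run n_starts times and keep the run with the smallest objective."""
    rng = random.Random(seed)
    runs = (kmeans_once(data, k, rng) for _ in range(n_starts))
    return min(runs, key=lambda run: run[0])

# Toy data: three obvious groups of three points each (not real intakes).
toy = [[0, 0], [0, 1], [1, 0],
       [10, 10], [10, 11], [11, 10],
       [0, 10], [1, 10], [0, 11]]

best_wss, best_labels, best_centres = kmeans_best_of(toy, k=3)
print(best_wss, best_labels)
```

A single run can stop at a poor local minimum if two starting means fall in the same group; taking the best of many runs makes that much less likely to be the reported solution.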

Page 15: Andrew Smith

Standardising the input variables
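Because k-means works on distances, a food group recorded in large amounts can dominate one recorded in small amounts unless the variables are put on a common scale. The slide text does not say which scheme was used, so the sketch below shows two common options (z-scores and division by the range) purely as assumptions; the intake values are hypothetical.

```python
import statistics

def z_scores(values):
    """Subtract the mean and divide by the standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def range_scale(values):
    """Subtract the minimum and divide by the range, giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical daily intakes in grams for four children (not ALSPAC data).
milk_g = [150.0, 300.0, 450.0, 600.0]
crisps_g = [10.0, 20.0, 25.0, 45.0]

# Unscaled, milk differences (hundreds of grams) would swamp crisps
# differences (tens of grams); after scaling both are comparable.
print(range_scale(milk_g))
print(range_scale(crisps_g))
print(z_scores(crisps_g))
```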

Page 16: Andrew Smith

Reliability of the cluster solution

• Split sample in half

• Perform separate analyses on each half

• See how many children change clusters

• Repeat 5 times

– 32 out of 8,279 children changed cluster (0.4%)
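One way such a split-half check could be coded is sketched below. The slide does not spell out how a ‘change’ was counted, so this sketch makes an assumption: each half is clustered separately, each half-sample cluster is matched to the nearest full-sample cluster mean, and a child counts as changed if the matched cluster differs from the full-sample assignment. All function names and the toy data are hypothetical.

```python
import random

def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def nearest(x, centres):
    """Index of the centre closest to x."""
    return min(range(len(centres)), key=lambda j: sq_dist(x, centres[j]))

def kmeans(data, k, rng, n_starts=20, max_iter=100):
    """Plain k-means with random restarts; returns (labels, centres)."""
    best = None
    for _ in range(n_starts):
        centres = rng.sample(data, k)
        labels = None
        for _ in range(max_iter):
            new_labels = [nearest(x, centres) for x in data]
            if new_labels == labels:
                break
            labels = new_labels
            for j in range(k):
                members = [x for x, lab in zip(data, labels) if lab == j]
                if members:
                    centres[j] = [sum(c) / len(members) for c in zip(*members)]
        wss = sum(sq_dist(x, centres[lab]) for x, lab in zip(data, labels))
        if best is None or wss < best[0]:
            best = (wss, labels, centres)
    return best[1], best[2]

def split_half_changes(data, k, rng):
    """Count subjects whose cluster differs between full and half-sample fits."""
    full_labels, full_centres = kmeans(data, k, rng)
    order = list(range(len(data)))
    rng.shuffle(order)  # random split into two halves
    mid = len(order) // 2
    changed = 0
    for half in (order[:mid], order[mid:]):
        half_labels, half_centres = kmeans([data[i] for i in half], k, rng)
        # Match each half-sample cluster to the nearest full-sample cluster.
        match = [nearest(c, full_centres) for c in half_centres]
        changed += sum(match[lab] != full_labels[i]
                       for i, lab in zip(half, half_labels))
    return changed

# Toy data: two tight, well-separated groups (not real intakes).
rng = random.Random(1)
group_a = [[0.1 * i, 0.0] for i in range(10)]
group_b = [[100 + 0.1 * i, 1.0] for i in range(10)]
data = group_a + group_b

# Repeat the split 5 times, as on the slide.
changes = [split_half_changes(data, 2, rng) for _ in range(5)]
print(changes)
```

With groups this clearly separated no subject changes cluster; on real dietary data a small non-zero count, such as the 0.4% reported above, would indicate a stable solution.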

Page 17: Andrew Smith

Results

• Food frequency questionnaire (FFQ) data

– Age 7

– 3 clusters

• Diet diary data

– Age 7, 10 and 13

– 4 clusters

Page 18: Andrew Smith

Processed

30.2% of children

Image: Suat Eman, artemisphoto, -Marcus- / FreeDigitalPhotos.net

Page 19: Andrew Smith

Plant-based (Healthy)

27.8% of children

Image: Suat Eman, Paul, Rob Wiltshire, Simon Howden, winnond / FreeDigitalPhotos.net

Page 20: Andrew Smith

Traditional British

21.3% of children

Image: Suat Eman, Maggie Smith, Simon Howden / FreeDigitalPhotos.net

Page 21: Andrew Smith

Packed Lunch

20.6% of children

Image: Grant Cochrane, luigi diamanti, Rawich, Master Isolated Images / FreeDigitalPhotos.net

Page 22: Andrew Smith

Associations with socio-demographic vars

               n       Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
Girls          3,115   1                          1                                    1
Boys           2,941   0.82 (0.72, 0.93)          1.03 (0.89, 1.20)                    1.18 (1.04, 1.34)
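The slide does not label the figures in the table. Assuming they are odds ratios relative to the reference row (shown as 1), with 95% confidence intervals in brackets computed on the log scale, two quick checks help when reading them: the estimate should sit at the geometric midpoint of its interval, and an interval that excludes 1 suggests an association. The sketch below applies both checks to the Boys row; the helper names and dictionary keys are hypothetical.

```python
import math

# (estimate, lower, upper) for the three columns of the Boys row,
# copied from the table on this slide.
boys_row = {
    "column_1": (0.82, 0.72, 0.93),
    "column_2": (1.03, 0.89, 1.20),
    "column_3": (1.18, 1.04, 1.34),
}

def log_scale_midpoint(lower, upper):
    """Geometric mean of the bounds, i.e. the midpoint on the log scale."""
    return math.sqrt(lower * upper)

def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1 or upper < 1

midpoints = {name: round(log_scale_midpoint(lo, hi), 2)
             for name, (_, lo, hi) in boys_row.items()}
flags = {name: excludes_one(lo, hi)
         for name, (_, lo, hi) in boys_row.items()}

print(midpoints)
print(flags)
```

For this row the rounded midpoints reproduce the tabulated estimates, and the first and third intervals exclude 1 while the second does not.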

Page 23: Andrew Smith

Associations with socio-demographic vars

Maternal age   n       Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
< 21           130     1                          1                                    1
21-25          994     0.59 (0.33, 1.07)          1.07 (0.56, 2.05)                    1.57 (1.02, 2.43)
26-30          2,644   0.52 (0.29, 0.92)          1.20 (0.64, 2.28)                    1.60 (1.04, 2.46)
31+            2,288   0.37 (0.21, 0.67)          1.50 (0.79, 2.88)                    1.77 (1.13, 2.76)

Page 24: Andrew Smith

Associations with socio-demographic vars

Maternal education   n       Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
CSE                  740     1                          1                                    1
Vocational           504     0.84 (0.60, 1.17)          1.19 (0.82, 1.72)                    1.01 (0.76, 1.32)
O level              2,163   0.65 (0.51, 0.83)          1.46 (1.10, 1.94)                    1.05 (0.86, 1.30)
A level              1,604   0.42 (0.33, 0.55)          2.01 (1.50, 2.69)                    1.18 (0.95, 1.48)
Degree               1,045   0.30 (0.23, 0.39)          2.75 (2.00, 3.76)                    1.22 (0.94, 1.57)

Page 25: Andrew Smith

Associations with socio-demographic vars

Siblings       n       Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
0 older        2,755   1                          1                                    1
1 older        2,317   1.21 (1.03, 1.42)          1.12 (0.94, 1.36)                    0.73 (0.62, 0.86)
2+ older       984     1.58 (1.28, 1.97)          0.99 (0.76, 1.27)                    0.64 (0.52, 0.80)

Page 26: Andrew Smith

Associations with socio-demographic vars

Siblings       n       Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
0 younger      2,946   1                          1                                    1
1 younger      2,490   1.01 (0.86, 1.19)          0.58 (0.48, 0.71)                    1.69 (1.44, 1.99)
2+ younger     620     1.21 (0.92, 1.57)          0.43 (0.33, 0.58)                    1.90 (1.50, 2.40)

Page 27: Andrew Smith

Summary

• Multivariate methods to compress dietary data into dietary patterns

• k-means cluster analysis is widespread but must be applied carefully

• 3 clusters in FFQ data (Processed, Plant-based and Traditional British)

• 4 clusters in diet diary data (+ Packed Lunch)

Page 28: Andrew Smith

References

• Northstone, AS et al. (2012) ‘Longitudinal comparisons of dietary patterns derived by cluster analysis in 7 to 13 year old children’ British Journal of Nutrition, to appear.

• AS et al. (2011) ‘A comparison of dietary patterns derived by cluster and principal components analysis in a UK cohort of children’ European Journal of Clinical Nutrition 65, p1102-9.