Bayesian dynamic modeling of latent trait distributions

Bayesian dynamic modeling of latent trait distributions. Duke University Machine Learning Group. Presented by Kai Ni, Jan. 25, 2007. Paper by David B. Dunson, Biostatistics, 2006.


Page 1: Bayesian dynamic modeling of latent trait distributions

Bayesian dynamic modeling of latent trait distributions

Duke University Machine Learning Group

Presented by Kai Ni

Jan. 25, 2007

Paper by David B. Dunson,

Biostatistics, 2006

Page 2: Bayesian dynamic modeling of latent trait distributions

Outline

• Introduction

• Measurement model

• Dynamic mixture of Dirichlet processes

• Inference

• Results & Conclusion

Page 3: Bayesian dynamic modeling of latent trait distributions

Motivation

• The general problem

– The primary response variable of interest cannot be measured directly, so one must rely on multiple surrogates.

– The different measured outcomes are assumed to be manifestations of a latent variable, which may depend on covariates.

• Example

– The frequency of DNA strand breaks cannot be measured directly, but gel electrophoresis provides surrogates. The distribution of DNA damage across cells may have different shapes depending on the level of oxidative stress.

Page 4: Bayesian dynamic modeling of latent trait distributions

Motivation (2)

• The paper focuses on developing an approach for assessing dynamic changes in the latent response distribution across levels of a predictor.

• Dynamic mixture of Dirichlet processes (DMDP)

– The latent response distribution in group h is represented as a mixture of the distribution in group h-1 and an unknown innovation distribution, which is assigned a DP prior.

Page 5: Bayesian dynamic modeling of latent trait distributions

Measurement Model

• Let $y_{hi} = (y_{hi1}, \dots, y_{hip})'$ denote a $p \times 1$ vector of surrogate measurements for the latent response of the $i$th subject ($i = 1, \dots, n_h$) in group $h$ ($h = 1, \dots, d$).

• For example, in the DNA damage study, $y_{hi}$ denotes surrogates of DNA damage for the $i$th cell in dose group $h$.

• The vector $y_{hi}$ may have both continuous and categorical elements. A mapping function links it to a vector of underlying continuous variables $y_{hi}^{*}$.
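One common choice of mapping, sketched here as an assumption rather than as the paper's exact specification, passes continuous elements through unchanged and obtains ordered categorical elements by thresholding the underlying continuous variable. The function name `observe` and the cutpoints are hypothetical.

```python
import numpy as np

def observe(y_star, cutpoints=None):
    """Map an underlying continuous value to an observed outcome.

    cutpoints=None -> continuous element, returned unchanged.
    cutpoints=[..] -> ordered categorical element: the category is the
                      number of cutpoints that are <= y_star.
    """
    if cutpoints is None:
        return y_star
    # np.digitize returns i such that cutpoints[i-1] <= y_star < cutpoints[i]
    return int(np.digitize(y_star, np.asarray(cutpoints)))

# Hypothetical cutpoints defining four ordered categories 0..3.
cuts = [-1.0, 0.0, 1.0]
category = observe(0.3, cuts)   # falls between 0.0 and 1.0
continuous = observe(0.7)       # continuous element is left as is
```

Under this convention a single latent scale can drive both kinds of surrogate, which is what lets one measurement model cover mixed outcomes.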

Page 6: Bayesian dynamic modeling of latent trait distributions

Measurement Model (2)

• Relate the underlying continuous variables to the latent response through a measurement model with the following components:

– Latent variable

– Intercept parameters

– Factor loadings

– Measurement errors

• A scale mixture of normal distributions is assumed for the residual distribution.

• The primary goal is to assess how the latent response distribution changes between groups.
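A standard factor-analytic form that matches the four components named on this slide can be written as follows. The symbols $\eta_{hi}$ (latent variable), $\mu$ (intercepts), $\lambda$ (factor loadings), and $\epsilon_{hi}$ (measurement errors) are assumptions chosen for illustration and may differ from the paper's notation.

```latex
y^{*}_{hi} = \mu + \lambda\,\eta_{hi} + \epsilon_{hi},
\qquad i = 1,\dots,n_h,\quad h = 1,\dots,d,
```

Here $\mu$ and $\lambda$ are $p \times 1$ vectors, $\eta_{hi}$ is a scalar latent response, and $\epsilon_{hi}$ is a $p \times 1$ residual vector.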

Page 7: Bayesian dynamic modeling of latent trait distributions

Dynamic mixture of Dirichlet processes

• First, the latent response distribution for group 1, $G_1$, is assumed to be drawn from a DP, which also determines the predictive density of the latent response for group 1.

• Assume the distribution $G_2$ for group 2 shares features with $G_1$ but that innovation may have occurred, so $G_2 = (1 - \pi_1) G_1 + \pi_1 H_1$, where the innovation distribution $H_1$ is assigned a DP prior with precision $\alpha_1$ and base distribution $H_{01}$.

• $G_2$ is randomly modified from $G_1$ by (1) reducing the probabilities allocated to the atoms in $G_1$ by a factor $(1 - \pi_1)$ and (2) incorporating new atoms drawn from the base distribution $H_{01}$.
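The two-step modification can be sketched with a truncated stick-breaking representation of the DP. This is an illustrative sketch, not the paper's sampler: the truncation level, the standard normal base distribution, and the names `draw_dp` and `innovate` are all assumptions.

```python
import numpy as np

def draw_dp(alpha, base_sampler, rng, n_atoms=200):
    """Truncated stick-breaking draw from DP(alpha * base): (atoms, weights)."""
    v = rng.beta(1.0, alpha, size=n_atoms)
    v[-1] = 1.0  # close the last stick so the weights sum to 1
    weights = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    atoms = base_sampler(rng, n_atoms)
    return atoms, weights

def innovate(prev, pi, alpha, base_sampler, rng):
    """G_h = (1 - pi) * G_{h-1} + pi * H, with H ~ DP(alpha * base)."""
    atoms_prev, w_prev = prev
    atoms_new, w_new = draw_dp(alpha, base_sampler, rng)
    # (1) shrink the old weights by (1 - pi); (2) append the new atoms.
    return (np.concatenate([atoms_prev, atoms_new]),
            np.concatenate([(1.0 - pi) * w_prev, pi * w_new]))

rng = np.random.default_rng(0)

def normal_base(rng, n):
    """Assumed base distribution: standard normal."""
    return rng.normal(0.0, 1.0, size=n)

G1 = draw_dp(alpha=1.0, base_sampler=normal_base, rng=rng)
G2 = innovate(G1, pi=0.3, alpha=1.0, base_sampler=normal_base, rng=rng)
```

Because `G2` keeps every atom of `G1` with a reduced weight, subjects in different groups can share atoms, which is how clustering across groups arises in this construction.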

Page 8: Bayesian dynamic modeling of latent trait distributions

DMDP

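• The difference between $G_1$ and $G_2$ has a mean and variance governed by the hyperparameters, with the following limiting behavior for any measurable set $B$:

$$G_2 \to G_1 \quad \text{as } \pi_1 \to 0$$

$$E\{G_2 \mid G_1, \pi_1, \alpha_1\} \to H_{01} \quad \text{as } \pi_1 \to 1$$

$$V\{G_2(B) \mid G_1, \pi_1, \alpha_1\} \to 0 \quad \text{as } \alpha_1 \to \infty \text{ or } \pi_1 \to 0$$

• The hyperparameters $\pi_1$, $\alpha_1$, and $H_{01}$ control the magnitude of the expected change from $G_1$ to $G_2$.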
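As a worked step, assume the two-group construction $G_2 = (1 - \pi_1) G_1 + \pi_1 H_1$ with $H_1 \sim \mathrm{DP}(\alpha_1 H_{01})$. The mean and variance of the difference then follow from the standard DP moments $E\{H_1(B)\} = H_{01}(B)$ and $V\{H_1(B)\} = H_{01}(B)\{1 - H_{01}(B)\}/(1 + \alpha_1)$. This derivation is a sketch under that assumption, not a quotation from the paper.

```latex
\begin{aligned}
G_2(B) - G_1(B) &= \pi_1\{H_1(B) - G_1(B)\},\\
E\{G_2(B) - G_1(B) \mid G_1, \pi_1, \alpha_1\} &= \pi_1\{H_{01}(B) - G_1(B)\},\\
V\{G_2(B) - G_1(B) \mid G_1, \pi_1, \alpha_1\} &= \pi_1^{2}\,\frac{H_{01}(B)\{1 - H_{01}(B)\}}{1 + \alpha_1}.
\end{aligned}
```

Both expressions vanish as $\pi_1 \to 0$, and the variance also vanishes as $\alpha_1 \to \infty$.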

Page 9: Bayesian dynamic modeling of latent trait distributions

DMDP

Page 10: Bayesian dynamic modeling of latent trait distributions

Correlation

• In the special case in which the same base distribution is chosen for each component in the mixture ($H_{0l} = H_0$ for all $l$), the correlation between consecutive $G$'s is available in closed form.

• The model also yields the prior probability that two subjects $(h, i)$ and $(h', i')$, in the same or different groups, are clustered together.

• For the hyperparameters, a beta distribution is chosen for each $\pi_h$ and a gamma distribution is chosen for each $\alpha_h$.
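For intuition about the clustering probability, the two-group case can be worked out conditionally on the hyperparameters. The sketch below assumes $G_1 \sim \mathrm{DP}(\alpha_0 H_{00})$ and $G_2 = (1 - \pi_1) G_1 + \pi_1 H_1$ with $H_1 \sim \mathrm{DP}(\alpha_1 H_{01})$, and non-atomic base distributions; the names `alpha0` and `H00` for group 1 are hypothetical, and the paper's general expression may differ.

```python
def prior_cluster_prob(group_a, group_b, pi1, alpha0, alpha1):
    """Prior probability that two distinct subjects share a cluster.

    Two-group sketch, conditional on the hyperparameters. With non-atomic
    base distributions, ties can only arise when both subjects draw from
    the same DP component (G1 or the innovation H1).
    """
    tie_g1 = 1.0 / (1.0 + alpha0)  # two draws from G1 coincide
    tie_h1 = 1.0 / (1.0 + alpha1)  # two draws from H1 coincide
    # Probability that a subject in each group draws from each component.
    from_g1 = {1: 1.0, 2: 1.0 - pi1}
    from_h1 = {1: 0.0, 2: pi1}
    return (from_g1[group_a] * from_g1[group_b] * tie_g1
            + from_h1[group_a] * from_h1[group_b] * tie_h1)

within_1 = prior_cluster_prob(1, 1, pi1=0.3, alpha0=1.0, alpha1=2.0)
across = prior_cluster_prob(1, 2, pi1=0.3, alpha0=1.0, alpha1=2.0)
within_2 = prior_cluster_prob(2, 2, pi1=0.3, alpha0=1.0, alpha1=2.0)
```

When `pi1` is 0 all three probabilities reduce to the single-DP value `1 / (1 + alpha0)`, matching the limit in which $G_2$ equals $G_1$.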

Page 11: Bayesian dynamic modeling of latent trait distributions

Sampling in the latent response model

Page 12: Bayesian dynamic modeling of latent trait distributions

Sampling in the measurement model

Page 13: Bayesian dynamic modeling of latent trait distributions

Inference on the latent response distribution

• Collect draws from the conditional predictive distribution of the latent response for a future subject in each dose group.

• After convergence, the samples of $\eta_{h,n_h+1}$, the latent response of a future subject in group $h$, represent draws from the predictive density of the latent response in group $h$. Inferences can be based on comparing these densities between groups.
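Once such draws are saved, comparing groups reduces to summarizing arrays of samples. The sketch below uses synthetic normal draws as stand-ins for MCMC output (they are not results from the paper), and the helper names `density_on_grid` and `summarize_predictive` are hypothetical.

```python
import numpy as np

def summarize_predictive(draws_by_group, probs=(0.05, 0.5, 0.95)):
    """Quantiles of the predictive draws for each group."""
    return {h: np.quantile(np.asarray(d), probs)
            for h, d in draws_by_group.items()}

def density_on_grid(draws, grid):
    """Histogram estimate of the predictive density over the bins in grid."""
    dens, _ = np.histogram(draws, bins=grid, density=True)
    return dens

rng = np.random.default_rng(1)
# Synthetic stand-ins for saved post-convergence draws, one array per group.
draws = {1: rng.normal(0.0, 1.0, 5000), 2: rng.normal(0.5, 1.2, 5000)}

grid = np.linspace(-6.0, 6.0, 61)  # common grid so densities are comparable
dens = {h: density_on_grid(d, grid) for h, d in draws.items()}
quantiles = summarize_predictive(draws)
```

Evaluating every group on the same grid makes shifts in location, spread, or shape between dose groups directly comparable.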

Page 14: Bayesian dynamic modeling of latent trait distributions

DNA damage study

• The study assessed the effect of oxidative stress on the frequency of DNA strand breaks using single-cell gel electrophoresis.

• 500 human lymphoblastoid cells drawn from an immortalized cell line were randomized to one of five dose groups (0, 5, 20, 50, or 100 micromoles H2O2).

• There are p=5 surrogate measures of DNA damage, including (1) % tail DNA, (2) tail extent divided by head extent, (3) extent tail moment, (4) Olive tail moment, and (5) tail extent.

Page 17: Bayesian dynamic modeling of latent trait distributions

Conclusion

• The author proposed a Bayesian semiparametric latent response model in which the latent variable density can shift dynamically across groups.

• Using a linear regression model to infer the latent variables may fail in many applications, whereas the measurement model proposed by the author is quite flexible.

• The DMDP should prove useful when interest focuses on clustering of observations within and across groups.