The mclust Package

January 18, 2005

Version 2.1-8

Author C. Fraley and A.E. Raftery, Dept. of Statistics, University of Washington.

Title Model-based cluster analysis

Description Model-based cluster analysis: the 2002 version of MCLUST

Depends R (>= 1.7.0)

License See http://www.stat.washington.edu/mclust/license.txt

Maintainer Ron Wehrens <R.Wehrens@science.ru.nl>

URL http://www.stat.washington.edu/mclust

R topics documented:

Defaults.Mclust
EMclust
EMclustN
Mclust
bic
bicE
bicEMtrain
cdens
cdensE
chevron
clPairs
classError
compareClass
coordProj
cv1EMtrain
decomp2sigma
dens
density
diabetes
em
emE
estep
estepE
grid1
hc
hcE
hclass
hypvol
lansing
map
mapClass
mclust-internal
mclust1Dplot
mclust2Dplot
mclustDA
mclustDAtest
mclustDAtrain
mclustOptions
me
meE
mstep
mstepE
mvn
mvnX
partconv
partuniq
plot.Mclust
plot.mclustDA
randProj
sigma2decomp
sim
simE
spinProj
summary.EMclust
summary.EMclustN
summary.Mclust
summary.mclustDAtest
summary.mclustDAtrain
surfacePlot
uncerPlot
unmap

Index

Defaults.Mclust List of values controlling defaults for some MCLUST functions.

Description

A named list of values including tolerances for singularity and convergence assessment, and an enumeration of models used as defaults in MCLUST functions.

Details

A function mclustOptions is supplied for assigning values to the .Mclust list.

A list with the following components:

eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is the relative machine precision .Machine$double.eps, which is approximately 2e-16 on IEEE-compliant machines.

tol A vector of length two giving relative convergence tolerances for the loglikelihood and for parameter convergence in the inner loop for models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), respectively. The default is c(1.e-5,1.e-5).

itmax A vector of length two giving integer limits on the number of EM iterations and on the number of iterations in the inner loop for models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), respectively. The default is c(Inf,Inf), allowing termination to be completely governed by tol.

equalPro Logical variable indicating whether or not the mixing proportions are equal in the model. Default: equalPro = FALSE.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. Default: warnSingular = TRUE.

emModelNames A vector of character strings indicating the models to be used for multivariate data in functions such as EMclust and mclustDAtrain that involve multiple models. The default is all of the multivariate models available in MCLUST:

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume and shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume and shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

hcModelName A vector of two character strings giving the name of the model to be used in the hierarchical clustering phase for univariate and multivariate data, respectively, in EMclust and EMclustN. The default is c("V","VVV"), giving the unconstrained model in each case.

symbols A vector whose entries are either integers corresponding to graphics symbols or single characters for plotting for classifications. Classes are assigned symbols in the given order.

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association. See http://www.stat.washington.edu/tech.reports (No. 380, 2000).

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/tech.reports.

See Also

mclustOptions, EMclust, mclustDAtrain, em, me, estep, mstep

Examples

n <- 250 ## create artificial data
set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))
odd <- seq(1, 2*n, 2)
train <- mclustDAtrain(x[odd, ], labels = xclass[odd]) ## training step
even <- odd + 1
test <- mclustDAtest(x[even, ], train) ## compute model densities

data(iris)
irisMatrix <- iris[,1:4]
irisClass <- iris[,5]

.Mclust

.Mclust <- mclustOptions(tol = 1.e-6, emModelNames = c("VII", "VVI", "VVV"))

.Mclust
irisBic <- EMclust(irisMatrix)
summary(irisBic, irisMatrix)
.Mclust <- mclustOptions() # restore defaults
.Mclust

EMclust BIC for Model-Based Clustering

Description

BIC for EM initialized by hierarchical clustering for parameterized Gaussian mixture models.

Usage

EMclust(data, G, emModelNames, hcPairs, subset, eps, tol, itmax, equalPro,
        warnSingular, ...)

Arguments

data A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.

G An integer vector specifying the numbers of mixture components (clusters) for which the BIC is to be calculated. The default is 1:9.

emModelNames A vector of character strings indicating the models to be fitted in the EM phase of clustering. Possible models:

"E" for spherical, equal variance (one-dimensional)
"V" for spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

The default is .Mclust$emModelNames.

hcPairs A matrix of merge pairs for hierarchical clustering such as produced by function hc. The default is to compute a hierarchical clustering tree by applying function hc with modelName = .Mclust$hcModelName[1] to univariate data and modelName = .Mclust$hcModelName[2] to multivariate data or a subset as indicated by the subset argument. The hierarchical clustering results are used as starting values for EM.

subset A logical or numeric vector specifying the indices of a subset of the data to be used in the initial hierarchical clustering phase.

eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.

tol A scalar tolerance for relative convergence of the loglikelihood. The default is .Mclust$tol.

itmax An integer limit on the number of EM iterations. The default is .Mclust$itmax.

equalPro Logical variable indicating whether or not the mixing proportions are equal in the model. The default is .Mclust$equalPro.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is warnSingular=FALSE.

... Provided so that lists with elements other than the arguments can be passed in indirect or list calls with do.call.
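The role of the ... argument can be illustrated with a small base R sketch. The function toySummary and the list fit are hypothetical stand-ins invented for this illustration; they are not MCLUST objects. The point is only that a function with ... in its signature can be called through do.call with a list that carries more elements than the function uses.

```r
# Hypothetical stand-in for a function that accepts "..." so that
# extra list elements are silently ignored.
toySummary <- function(loglik, n, ...) {
  loglik / n   # average loglikelihood per observation
}

# A list holding more elements than toySummary() uses, much as a
# fitted-model list would.
fit <- list(loglik = -180.5, n = 150, modelName = "VVI", G = 3)

direct   <- toySummary(loglik = fit$loglik, n = fit$n)
indirect <- do.call("toySummary", fit)   # modelName and G are absorbed by "..."

identical(direct, indirect)
```

Without ... in the signature, the do.call form would stop with an "unused argument" error.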

Value

Bayesian Information Criterion for the specified mixture models and numbers of clusters. Auxiliary information is returned as attributes.

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

summary.EMclust, EMclustN, hc, me, mclustOptions

Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])

irisBic <- EMclust(irisMatrix)
irisBic
plot(irisBic)

irisBic <- EMclust(irisMatrix, subset = sample(1:nrow(irisMatrix), 100))
irisBic
plot(irisBic)

EMclustN BIC for Model-Based Clustering with Poisson Noise

Description

BIC for EM initialized by hierarchical clustering for parameterized Gaussian mixture models with Poisson noise.

Usage

EMclustN(data, G, emModelNames, noise, hcPairs, eps, tol, itmax,
         equalPro, warnSingular=FALSE, Vinv, ...)

Arguments

G An integer vector specifying the numbers of MVN (Gaussian) mixture components (clusters) for which the BIC is to be calculated. The default is 0:9, where 0 indicates only a noise component.

emModelNames A vector of character strings indicating the models to be fitted in the EM phase of clustering. Possible models:

"E" for spherical, equal variance (one-dimensional)
"V" for spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

noise A logical or numeric vector indicating whether or not observations are initially estimated to be noise in the data. If there is no noise, EMclust should be used rather than EMclustN.

hcPairs A matrix of merge pairs for hierarchical clustering such as produced by function hc. The default is to compute a hierarchical clustering tree by applying function hc with modelName = .Mclust$hcModelName[1] to univariate data and modelName = .Mclust$hcModelName[2] to multivariate data or a subset as indicated by the subset argument. The hierarchical clustering results are used as starting values for EM.

Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data.
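As a rough illustration of what a reciprocal hypervolume is, the base R sketch below uses the axis-aligned bounding box of the data. This is an assumption made for illustration only: hypvol may use a different or more refined estimate, and boxVinv is a hypothetical name, not an MCLUST function.

```r
# Hypothetical helper: reciprocal of the volume of the axis-aligned
# bounding box of the data (an assumed, simple stand-in for Vinv).
boxVinv <- function(data) {
  data <- as.matrix(data)
  sides <- apply(data, 2, function(x) diff(range(x)))  # side length per variable
  1 / prod(sides)
}

data(iris)
irisMatrix <- as.matrix(iris[, 1:4])
boxVinv(irisMatrix)
```

A value computed this way could be supplied through the Vinv argument in place of the default.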

Value

Bayesian Information Criterion for the specified mixture models and numbers of clusters. Auxiliary information is returned as attributes.

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

See Also

summary.EMclustN, EMclust, hc, me, mclustOptions

Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]

b <- apply( irisMatrix, 2, range)
n <- 450
set.seed(0)
poissonNoise <- apply(b, 2, function(x, n=n)
                      runif(n, min = x[1]-0.1, max = x[2]+.1), n = n)

set.seed(0)
noiseInit <- sample(c(TRUE,FALSE),size=150+450,replace=TRUE,prob=c(3,1))
Bic <- EMclustN(data=rbind(irisMatrix, poissonNoise), noise = noiseInit)
Bic
plot(Bic)

Mclust Model-Based Clustering

Description

Clustering via EM initialized by hierarchical clustering for parameterized Gaussian mixture models. The number of clusters and the clustering model are chosen to maximize the BIC.

Usage

Mclust(data, minG, maxG)

Arguments

minG An integer vector specifying the minimum number of mixture components (clusters) to be considered. The default is 1 component.

maxG An integer vector specifying the maximum number of mixture components (clusters) to be considered. The default is 9 components.

Value

A list representing the best model (according to BIC) for the given range of numbers of clusters. The following components are included:

BIC A matrix giving the BIC value for each model (rows) and number of clusters (columns).

bic A scalar giving the optimal BIC value.

modelName The MCLUST name for the best model according to BIC.

classification The classification corresponding to the optimal BIC value.

uncertainty The uncertainty in the classification corresponding to the optimal BIC value.

mu For multidimensional models, a matrix whose columns are the means of each group in the best model. For one-dimensional models, a vector whose entries are the means for each group in the best model.

sigma For multidimensional models, a three dimensional array in which sigma[,,k] gives the covariance for the kth group in the best model. For one-dimensional models, either a scalar giving a common variance for the groups or a vector whose entries are the variances for each group in the best model.

pro The mixing probabilities for each component in the best model.

z A matrix whose [i,k]th entry is the probability that observation i belongs to the kth component in the model. The optimal classification is derived from this, choosing the class to be the one giving the maximum probability.

loglik The log likelihood for the data under the best model.
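The relationship between z and the classification can be sketched in base R. The matrix below is a made-up toy example, not output from Mclust. The uncertainty is computed here as one minus the largest probability in each row, which is a common definition and is assumed rather than taken from the text above.

```r
# Toy matrix of conditional probabilities: 4 observations, 3 components.
z <- matrix(c(0.90, 0.05, 0.05,
              0.20, 0.70, 0.10,
              0.10, 0.30, 0.60,
              0.40, 0.35, 0.25),
            nrow = 4, byrow = TRUE)

classification <- apply(z, 1, which.max)   # component with the largest probability
uncertainty    <- 1 - apply(z, 1, max)     # assumed definition: 1 - max probability

cbind(classification, uncertainty)
```

The last observation is assigned to component 1 but has the highest uncertainty, since its probabilities are spread fairly evenly across the components.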

Details

The following models are compared in Mclust:

"E" for spherical, equal variance (one-dimensional)
"V" for spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"VVV": ellipsoidal, varying volume, shape, and orientation

Mclust is intended to combine EMclust and its summary in a simplified one-step model-based clustering function. The latter provide more flexibility, including choice of models.

References

See Also

plot.Mclust, EMclust

Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]
irisMclust <- Mclust(irisMatrix)

## Not run: plot(irisMclust,irisMatrix)

bic BIC for Parameterized MVN Mixture Models

Description

Compute the BIC (Bayesian Information Criterion) for parameterized mixture models given the loglikelihood, the dimension of the data, and number of mixture components in the model.

Usage

bic(modelName, loglik, n, d, G, ...)

Arguments

modelName A character string indicating the model. Possible models:

"E" for spherical, equal variance (one-dimensional)
"V" for spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

loglik The loglikelihood for a data set with respect to the MVN mixture model specified in the modelName argument.

n The number of observations in the data used to compute loglik.

d The dimension of the data used to compute loglik.

G The number of components in the MVN mixture model used to compute loglik.

... Arguments for diagonal-specific methods, in particular

equalPro A logical variable indicating whether or not the components in the model are assumed to be present in equal proportion. The default is .Mclust$equalPro.

noise A logical variable indicating whether or not the model includes an optional Poisson noise component. The default is to assume that the model does not include a noise component.

Value

The BIC or Bayesian Information Criterion for the given input arguments.
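How the inputs combine can be sketched for one model. The sketch assumes the convention BIC = 2 loglik - (number of parameters) log(n), under which larger values are better, and counts parameters for the "VVI" model as G*d means, G*d diagonal variances, and G-1 free mixing proportions (none when equalPro is TRUE). The function toyBicVVI is a hypothetical illustration, not the MCLUST implementation, and it ignores the noise option.

```r
# Hypothetical helper (not the MCLUST bic function): BIC for the "VVI"
# model under the assumed convention 2*loglik - npar*log(n).
toyBicVVI <- function(loglik, n, d, G, equalPro = FALSE) {
  nMean <- G * d                       # one mean vector per component
  nVar  <- G * d                       # one diagonal covariance per component
  nPro  <- if (equalPro) 0 else G - 1  # free mixing proportions
  2 * loglik - (nMean + nVar + nPro) * log(n)
}

toyBicVVI(loglik = -300, n = 150, d = 4, G = 3)
toyBicVVI(loglik = -300, n = 150, d = 4, G = 3, equalPro = TRUE)
```

With equal proportions there are fewer free parameters, so the penalty is smaller and the value is larger for the same loglikelihood.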

References

See Also

bicE, ..., bicVVV, EMclust, estep, mclustOptions, do.call

Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]

n <- nrow(irisMatrix)
d <- ncol(irisMatrix)
G <- 3

emEst <- me(modelName="VVI", data=irisMatrix, unmap(irisClass))
names(emEst)

args(bic)
bic(modelName="VVI",loglik=emEst$loglik,n=n,d=d,G=G)
## Not run: do.call("bic", emEst) ## alternative call

bicE BIC for a Parameterized MVN Mixture Model

Description

Compute the BIC (Bayesian Information Criterion) for a parameterized mixture model given the loglikelihood, the dimension of the data, and number of mixture components in the model.

Usage

bicE(loglik, n, G, equalPro, noise = FALSE, ...)
bicV(loglik, n, G, equalPro, noise = FALSE, ...)
bicEII(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicVII(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicEEI(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicVEI(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicEVI(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicVVI(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicEEE(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicEEV(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicVEV(loglik, n, d, G, equalPro, noise = FALSE, ...)
bicVVV(loglik, n, d, G, equalPro, noise = FALSE, ...)

Arguments

loglik The loglikelihood for a data set with respect to the MVN mixture model.

n The number of observations in the data used to compute loglik.

d The dimension of the data used to compute loglik.

G The number of components in the MVN mixture model used to compute loglik.

equalPro A logical variable indicating whether or not the components in the model are assumed to be present in equal proportion. The default is .Mclust$equalPro.

noise A logical variable indicating whether or not the model includes an optional Poisson noise component. The default is to assume that the model does not include a noise component.

... Catch unused arguments from a do.call call.

Value

The BIC or Bayesian Information Criterion for the MVN mixture model and data set corresponding to the input arguments.

References

See Also

bic, EMclust, estepE, mclustOptions, do.call

Examples

## To run an example, see man page for bic
## Not run:
data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]

n <- nrow(irisMatrix)
d <- ncol(irisMatrix)
G <- 3

emEst <- meVVI(data=irisMatrix, unmap(irisClass))
names(emEst)

bicVVI(loglik=emEst$loglik, n=n, d=d, G=G)
do.call("bicVVI", emEst) ## alternative call
## End(Not run)

bicEMtrain Select models in discriminant analysis using BIC

Description

For the ten available discriminant models the BIC is calculated. The models for one-dimensional data are "E" and "V"; for higher dimensions they are "EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "EEV", "VEV" and "VVV". This function is much faster than cv1EMtrain.

Usage

bicEMtrain(data, labels, modelNames)

Arguments

data A data matrix

labels Labels for each row in the data matrix

modelNames Vector of model names that should be tested.

Value

Returns a vector where each element is the BIC for the corresponding model.

Author(s)

C. Fraley

See Also

cv1EMtrain

Examples

data(lansing)
odd <- seq(from=1, to=nrow(lansing), by=2)
round(bicEMtrain(lansing[odd,-3], labels=lansing[odd, 3]), 1)

cdens Component Density for Parameterized MVN Mixture Models

Description

Computes component densities for observations in parameterized MVN mixture models.

Usage

cdens(modelName, data, mu, ...)

Arguments

modelName A character string indicating the model. Possible models:

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

For fitting a single Gaussian:

"X": one-dimensional
"XII": spherical
"XXI": diagonal
"XXX": ellipsoidal

mu The mean for each component. If there is more than one component, mu is a matrix whose columns are the means of the components.

... Arguments for model-specific functions. Specifically:

• logarithm: A logical value indicating whether or not the logarithm of the component densities should be returned. The default is to return the component densities, obtained from the log component densities by exponentiation.

• An argument describing the variance (depends on the model):

sigmasq for the one-dimensional models ("E", "V") and spherical models ("EII", "VII"). This is either a vector whose kth component is the variance for the kth component in the mixture model ("V" and "VII"), or a scalar giving the common variance for all components in the mixture model ("E" and "EII").

decomp for the diagonal models ("EEI", "VEI", "EVI", "VVI") and some ellipsoidal models ("EEV", "VEV"). This is a list with the following components:

d The dimension of the data.

G The number of components in the mixture model.

scale Either a G-vector giving the scale of the covariance (the dth root of its determinant) for each component in the mixture model, or a single numeric value if the scale is the same for each component.

shape Either a G by d matrix in which the kth column is the shape of the covariance matrix (normalized to have determinant 1) for the kth component, or a d-vector giving a common shape for all components.

orientation Either a d by d by G array whose [,,k]th entry is the orthonormal matrix of eigenvectors of the covariance matrix of the kth component, or a d by d orthonormal matrix if the mixture components have a common orientation. The orientation component of decomp can be omitted in spherical and diagonal models, for which the principal components are parallel to the coordinate axes so that the orientation matrix is the identity.

Sigma for the equal variance model "EEE". A d by d matrix giving the common covariance for all components of the mixture model.

sigma for the unconstrained variance model "VVV". A d by d by G matrix array whose [,,k]th entry is the covariance matrix for the kth component of the mixture model.

The form of the variance specification is the same as for the output for the em, me, or mstep methods for the specified mixture model.

• eps: A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps. For those models with iterative M-step ("VEI", "VEV"), two values can be entered for eps, in which case the second value is used for determining singularity in the M-step.

• warnSingular: A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is .Mclust$warnSingular.

Value

A numeric matrix whose [i,j]th entry is the density of observation i in component j. The densities are not scaled by mixing proportions.
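The scale, shape and orientation components of decomp described above can be illustrated for a single covariance matrix with base R. This is a minimal sketch built directly from the definitions in the text (scale as the dth root of the determinant, shape normalized to have product 1, orientation as the eigenvector matrix). The matrix Sigma and the variable names are made up for the example, and the code is not taken from MCLUST.

```r
# Decompose one covariance matrix into scale, shape and orientation.
Sigma <- matrix(c(4.0, 1.5,
                  1.5, 1.0), nrow = 2, byrow = TRUE)
d <- nrow(Sigma)

e <- eigen(Sigma, symmetric = TRUE)
covScale    <- det(Sigma)^(1/d)    # dth root of the determinant
shape       <- e$values / covScale # normalized so that prod(shape) is 1
orientation <- e$vectors           # orthonormal matrix of eigenvectors

# Reassemble the covariance to check the decomposition.
rebuilt <- covScale * orientation %*% diag(shape) %*% t(orientation)
all.equal(rebuilt, Sigma)
```

For a diagonal covariance the eigenvectors lie along the coordinate axes, which is why the orientation component can be omitted for the spherical and diagonal models.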

References

See Also

cdensE, ..., cdensVVV, dens, EMclust, mstep, mclustDAtrain, mclustDAtest, mclustOptions, do.call

Examples

n <- 100 ## create artificial data

set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))
clPairs(x, cl = xclass, sym = c("1","2")) ## display the data

set.seed(0)
I <- sample(1:(2*n)) ## random ordering of the data
x <- x[I, ]
xclass <- xclass[I]

odd <- seq(1, 2*n, by = 2)
oddBic <- EMclust(x[odd, ])
oddSumry <- summary(oddBic, x[odd, ]) ## best parameter estimates
names(oddSumry)

even <- odd + 1
temp <- cdens(modelName = oddSumry$modelName, data = x[even, ],
              mu = oddSumry$mu, decomp = oddSumry$decomp)
cbind(class = xclass[even], temp)

## alternative call

## Not run:
temp <- do.call( "cdens", c(list(data = x[even, ]), oddSumry))
cbind(class = xclass[even], temp)
## End(Not run)

cdensE Component Density for a Parameterized MVN Mixture Model

Description

Computes component densities for points in a parameterized MVN mixture model.

Usage

cdensE(data, mu, sigmasq, eps, warnSingular, logarithm = FALSE, ...)
cdensV(data, mu, sigmasq, eps, warnSingular, logarithm = FALSE, ...)
cdensEII(data, mu, sigmasq, eps, warnSingular, logarithm = FALSE, ...)
cdensVII(data, mu, sigmasq, eps, warnSingular, logarithm = FALSE, ...)
cdensEEI(data, mu, decomp, eps, warnSingular, logarithm = FALSE, ...)
cdensVEI(data, mu, decomp, eps, warnSingular, logarithm = FALSE, ...)
cdensEVI(data, mu, decomp, eps, warnSingular, logarithm = FALSE, ...)
cdensVVI(data, mu, decomp, eps, warnSingular, logarithm = FALSE, ...)
cdensEEE(data, mu, eps, warnSingular, logarithm = FALSE, ...)
cdensEEV(data, mu, decomp, eps, warnSingular, logarithm = FALSE, ...)
cdensVEV(data, mu, decomp, eps, warnSingular, logarithm = FALSE, ...)
cdensVVV(data, mu, eps, warnSingular, logarithm = FALSE, ...)

Arguments

sigmasq for the one-dimensional models ("E", "V") and spherical models ("EII", "VII"). This is either a vector whose kth component is the variance for the kth component in the mixture model ("V" and "VII"), or a scalar giving the common variance for all components in the mixture model ("E" and "EII").

decomp for the diagonal models ("EEI", "VEI", "EVI", "VVI") and some ellipsoidal models ("EEV", "VEV"). This is a list described in more detail in cdens.

logarithm A logical value indicating whether or not the logarithm of the component densities should be returned. The default is to return the component densities, obtained from the log component densities by exponentiation.

... An argument giving the variance that takes one of the following forms:

decomp for models "EII" and "VII"; see above.

cholSigma see Sigma, for "EEE".

Sigma for the equal variance model "EEE". A d by d matrix giving the common covariance for all components of the mixture model.

cholsigma see sigma, for "VVV".

sigma for the unconstrained variance model "VVV". A d by d by G matrix array whose [,,k]th entry is the covariance matrix for the kth component of the mixture model.

The form of the variance specification is the same as for the output for the em, me, or mstep methods for the specified mixture model. Also used to catch unused arguments from a do.call call.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is .Mclust$warnSingular.

Value

A numeric matrix whose [i,j]th entry is the density of observation i in component j. The densities are not scaled by mixing proportions.
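What such a density matrix contains can be sketched in base R for the spherical case with one variance per component. The function toyCdensVII is a hypothetical illustration written from the standard Gaussian density formula; it is not the MCLUST cdensVII function and performs no singularity checks. It assumes mu is a matrix whose columns are the component means and that data has more than one row.

```r
# Hypothetical helper (not the MCLUST cdensVII function): component
# densities for spherical Gaussians with one variance per component.
toyCdensVII <- function(data, mu, sigmasq, logarithm = FALSE) {
  data <- as.matrix(data)
  d <- ncol(data)
  G <- ncol(mu)                       # columns of mu are the component means
  logDens <- sapply(seq_len(G), function(k) {
    sqDist <- rowSums(sweep(data, 2, mu[, k])^2)   # squared distance to mean k
    -0.5 * d * log(2 * pi * sigmasq[k]) - 0.5 * sqDist / sigmasq[k]
  })
  if (logarithm) logDens else exp(logDens)
}

set.seed(0)
x <- rbind(matrix(rnorm(20, mean = 0), 10, 2),
           matrix(rnorm(20, mean = 5), 10, 2))
mu <- cbind(c(0, 0), c(5, 5))
dens <- toyCdensVII(x, mu, sigmasq = c(1, 1))
dim(dens)   # one row per observation, one column per component
```

As in the description above, the columns are not weighted by mixing proportions; multiplying each column by its proportion and summing across columns would give the mixture density.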

References

See Also

cdens, dens, EMclust, mstep, mclustOptions, do.call

Examples

n <- 100 ## create artificial data
set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))

modelVII <- meVII(x, z = unmap(xclass))
modelVVI <- meVVI(x, z = unmap(xclass))
modelVVV <- meVVV(x, z = unmap(xclass))

names(modelVII)
args(cdensVII)
cdenVII <- cdensVII(data = x, mu = modelVII$mu, pro = modelVII$pro,
                    decomp = modelVII$decomp)
names(modelVVI)
args(cdensVVI)
cdenVVI <- cdensVVI(data = x, mu = modelVVI$mu, pro = modelVVI$pro,
                    decomp = modelVVI$decomp)
names(modelVVV)
args(cdensVVV)
cdenVVV <- cdensVVV(data = x, mu = modelVVV$mu, pro = modelVVV$pro,
                    cholsigma = modelVVV$cholsigma)

cbind(class=xclass,VII=map(cdenVII),VVI=map(cdenVVI),VVV=map(cdenVVV))

## alternative call

## Not run:
cdenVII <- do.call("cdensVII", c(list(data = x), modelVII))
cdenVVI <- do.call("cdensVVI", c(list(data = x), modelVVI))
cdenVVV <- do.call("cdensVVV", c(list(data = x), modelVVV))

cbind(class=xclass,VII=map(cdenVII),VVI=map(cdenVVI),VVV=map(cdenVVV))
## End(Not run)

chevron Simulated minefield data

Description

A two-dimensional data set of simulated minefield data (1104 observations).

data(chevron)

References

C. Fraley and A.E. Raftery, Computer J., 41:578-588 (1998)

clPairs Pairwise Scatter Plots showing Classification

Description

Creates a scatter plot for each pair of variables in given data. Observations in different classes are represented by different symbols.

clPairs(data, classification, symbols, labels=dimnames(data)[[2]],
        CEX=1, col, ...)


Arguments

classification A numeric or character vector representing a classification of observations (rows) of data.

symbols Either an integer or character vector assigning a plotting symbol to each unique class in classification. Elements in symbols correspond to classes in order of appearance in the sequence of observations (the order used by the function unique). Default: If G is the number of groups in the classification, the first G symbols in .Mclust$symbols, otherwise if G is less than 27 then the first G capital letters in the Roman alphabet. If no classification argument is given the default symbol is ".".

labels A vector of character strings for labeling the variables. The default is to use the column dimension names of data.

CEX An argument specifying the size of the plotting symbols. The default value is 1.

col Color vector to use. Default is one color per class. Splus default: all black.

... Additional arguments to be passed to the graphics device.

Side Effects

Scatter plots for each combination of variables in data are created on the current graphics device. Observations of different classifications are labeled with different symbols.

References

See Also

pairs, coordProj, mclustOptions

Examples

clPairs(irisMatrix, cl=irisClass, symbols=as.character(1:3))


classError Classification error.

Description

Error for a given classification relative to a known truth. Location of errors in a given classification relative to a known truth.

classError(classification, truth)

Arguments

classification A numeric or character vector of class labels.

truth A numeric or character vector of class labels. Must have the same length as classification.

Details

classErrors will only return one possibility if more than one mapping between classification and truth results in the minimum error.

classError gives the fraction of elements misclassified for classification relative to truth. classErrors is a logical vector of the same length as classification and truth which gives the location of misclassified elements in classification relative to truth.

See Also

compareClass, mapClass, table

Examples

a <- rep(1:3, 3)
a
b <- rep(c("A", "B", "C"), 3)
b
classError(a, b)
classErrors(a, b)

a <- sample(1:3, 9, replace = TRUE)
a
b <- sample(c("A", "B", "C"), 9, replace = TRUE)
b
classError(a, b)
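The Details above describe the error as a minimum over mappings between the two label sets. The following base R sketch illustrates that idea only; it is not the package's implementation. The helper names all_perms and min_class_error are hypothetical, and the sketch assumes both vectors contain the same number of distinct labels (it enumerates all permutations, so it is practical only for a handful of classes).

```r
# Hypothetical helper: all orderings of the elements of v.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Hypothetical helper: smallest misclassification rate over all
# one-to-one relabelings of `classification` onto the labels of `truth`.
min_class_error <- function(classification, truth) {
  classification <- as.character(classification)
  truth <- as.character(truth)
  cl <- unique(classification)
  tr <- unique(truth)
  stopifnot(length(classification) == length(truth),
            length(cl) == length(tr))
  best <- 1
  for (p in all_perms(tr)) {
    mapped <- p[match(classification, cl)]   # apply this candidate mapping
    best <- min(best, mean(mapped != truth)) # keep the smallest error rate
  }
  best
}

a <- rep(1:3, 3)
b <- rep(c("A", "B", "C"), 3)
min_class_error(a, b)                                  # same partition, different names
min_class_error(c(1, 1, 2, 2), c("x", "x", "x", "y"))  # one element cannot be matched
```

Several mappings can attain the same minimum, which is why the Details note that only one possibility is returned.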


compareClass Compare classifications.

Description

Compare classifications via the normalized variation of information criterion.

compareClass(a, b)

Arguments

a A numeric or character vector of class labels.

b A numeric or character vector of class labels. Must have the same length as a.

The variation of information criterion (Meila 2002) for a and b divided by the log of the length of the sequences so that it falls in [0,1].

References

Marina Meila (2002). Comparing clusterings. Technical Report no. 418, Department of Statistics, University of Washington.

See http://www.stat.washington.edu/www/research/reports.

See Also

mapClass, classError, table

Examples

a <- rep(1:3, 3)
a
b <- rep(c("A", "B", "C"), 3)
b
compareClass(a, b)
a <- sample(1:3, 9, replace = TRUE)
a
b <- sample(c("A", "B", "C"), 9, replace = TRUE)
b
compareClass(a, b)
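As a rough illustration of the criterion described above, the variation of information between two labelings can be written as 2 H(a,b) - H(a) - H(b), where H denotes entropy, and then divided by the log of the sequence length. The base R sketch below follows that formula; the function name norm_vi is hypothetical and the code is not taken from the package.

```r
# Hypothetical helper: variation of information between labelings a and b,
# divided by log(n) so that the result lies in [0, 1].
norm_vi <- function(a, b) {
  stopifnot(length(a) == length(b))
  n <- length(a)
  entropy <- function(p) {
    p <- p[p > 0]                 # treat 0 * log(0) as 0
    -sum(p * log(p))
  }
  joint <- table(a, b) / n        # joint distribution of the two labelings
  h_a  <- entropy(rowSums(joint))
  h_b  <- entropy(colSums(joint))
  h_ab <- entropy(as.vector(joint))
  (2 * h_ab - h_a - h_b) / log(n)
}

norm_vi(rep(1:3, 3), rep(c("A", "B", "C"), 3))  # identical partitions
norm_vi(1:4, rep(1, 4))                         # all singletons vs. one cluster
```

The two calls show the extremes: identical partitions give the smallest value, and comparing all-singleton clusters with a single cluster gives the largest.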


coordProj Coordinate projections of data in more than two dimensions modelled by an MVN mixture.

Description

Plots coordinate projections given data in more than two dimensions and parameters of an MVN mixture model for the data.

coordProj(data, ..., dimens = c(1, 2),
          type = c("classification","uncertainty","errors"), ask = TRUE,
          quantiles = c(0.75, 0.95), symbols, scale = FALSE,
          identify = FALSE, CEX = 1, PCH = ".", xlim, ylim)

Arguments

data A numeric matrix or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.

dimens A vector of length 2 giving the integer dimensions of the desired coordinate projections. The default is c(1,2), in which the first dimension is plotted against the second.

... One or more of the following:

classification A numeric or character vector representing a classification of observations (rows) of data.

uncertainty A numeric vector of values in (0,1) giving the uncertainty of each data point.

z A matrix in which the [i,k]th entry gives the probability of observation i belonging to the kth class. Used to compute classification and uncertainty if those arguments aren't available.

truth A numeric or character vector giving a known classification of each data point. If classification or z is also present, this is used for displaying classification errors.

mu A matrix whose columns are the means of each group.

sigma A three dimensional array in which sigma[,,k] gives the covariance for the kth group.

decomp A list with scale, shape and orientation components giving an alternative form for the covariance structure of the mixture model.

type Any subset of c("classification","uncertainty","errors"). The function will produce the corresponding plot if it has been supplied sufficient information to do so. If more than one plot is possible then users will be asked to choose from a menu if ask=TRUE.

ask A logical variable indicating whether or not a menu should be produced when more than one plot is possible. The default is ask=TRUE.


quantiles A vector of length 2 giving quantiles used in plotting uncertainty. The smallest symbols correspond to the smallest quantile (lowest uncertainty), medium-sized (open) symbols to points falling between the given quantiles, and large (filled) symbols to those in the largest quantile (highest uncertainty). The default is (0.75,0.95).

symbols Either an integer or character vector assigning a plotting symbol to each unique class in classification. Elements in symbols correspond to classes in classification in sorted order. Default: If G is the number of groups in the classification, the first G symbols in .Mclust$symbols, otherwise if G is less than 27 then the first G capital letters in the Roman alphabet.

scale A logical variable indicating whether or not the two chosen dimensions should be plotted on the same scale, and thus preserve the shape of the distribution. Default: scale=FALSE

identify A logical variable indicating whether or not to add a title to the plot identifying the dimensions used.

PCH An argument specifying the symbol to be used when a classification has not been specified for the data. The default value is a small dot ".".

xlim, ylim Arguments specifying bounds for the abscissa and ordinate of the plot, respectively. This may be useful when comparing plots.

Side Effects

Coordinate projections of the data, possibly showing location of the mixture components, classification, uncertainty, and/or classification errors.

References

C. Fraley and A. E. Raftery (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

C. Fraley and A. E. Raftery (2002). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

clPairs, randProj, mclust2Dplot, mclustOptions, do.call

Examples

msEst <- mstepVVV(irisMatrix, unmap(irisClass))

par(pty = "s", mfrow = c(1,2))
coordProj(irisMatrix, dimens=c(2,3), truth = irisClass,
          mu = msEst$mu, sigma = msEst$sigma, z = msEst$z)
do.call("coordProj", c(list(data=irisMatrix, dimens=c(2,3), truth=irisClass),
                       msEst))


cv1EMtrain Select discriminant models using cross validation

Description

For the ten available discriminant models the leave-one-out cross validation error is calculated. The models for one-dimensional data are "E" and "V"; for higher dimensions they are "EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "EEV", "VEV" and "VVV".

cv1EMtrain(data, labels, modelNames)

Arguments

data A data matrix

labels Labels for each row in the data matrix

modelNames Vector of model names that should be tested.

Returns a vector where each element is the error rate for the corresponding model.

Author(s)

C. Fraley

See Also

bicEMtrain

Examples

data(lansing)
odd <- seq(from=1, to=nrow(lansing), by=2)
round(cv1EMtrain(data=lansing[odd,-3], labels=lansing[odd,3]), 3)

cv1Modd <- mstepEEV(data=lansing[odd,-3], z=unmap(lansing[odd,3]))
cv1Zodd <- do.call("estepEEV", c(cv1Modd, list(data=lansing[odd,-3])))$z
compareClass(map(cv1Zodd), lansing[odd,3])

even <- (1:nrow(lansing))[-odd]
cv1Zeven <- do.call("estepEEV", c(cv1Modd, list(data=lansing[even,-3])))$z
compareClass(map(cv1Zodd), lansing[odd,3])$error
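To make the leave-one-out idea concrete, here is a deliberately simplified base R sketch. It swaps in a much simpler classifier than the package's discriminant models: one univariate Gaussian per class with its own mean and variance. The function name loo_error is hypothetical, and the sketch assumes every class keeps at least two observations after one point is held out.

```r
# Hypothetical helper: leave-one-out error rate for a one-Gaussian-per-class
# classifier on univariate data (a simplification, not cv1EMtrain itself).
loo_error <- function(x, labels) {
  labels <- as.character(labels)
  classes <- unique(labels)
  wrong <- 0
  for (i in seq_along(x)) {
    xt <- x[-i]                       # training data without observation i
    lt <- labels[-i]
    logpost <- sapply(classes, function(cl) {
      xc <- xt[lt == cl]
      # log prior + Gaussian log density of the held-out point
      log(length(xc) / length(xt)) + dnorm(x[i], mean(xc), sd(xc), log = TRUE)
    })
    if (classes[which.max(logpost)] != labels[i]) wrong <- wrong + 1
  }
  wrong / length(x)
}

set.seed(2)
x <- c(rnorm(30, mean = 0), rnorm(30, mean = 20))
labels <- rep(c("a", "b"), each = 30)
loo_error(x, labels)
```

Each observation is classified by a model fitted without it, so the resulting rate estimates how the classifier would do on new data.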


decomp2sigma Convert mixture component covariances to matrix form.

Description

Converts a set of covariances from a parameterization by eigenvalue decomposition to representation as a 3-D array.

decomp2sigma(d, G, scale, shape, orientation, ...)

Arguments

scale Either a G-vector giving the scale of the covariance (the dth root of its determinant) for each component in the mixture model, or a single numeric value if the scale is the same for each component.

shape Either a d by G matrix in which the kth column is the shape of the covariance matrix (normalized to have determinant 1) for the kth component, or a d-vector giving a common shape for all components.

orientation Either a d by d by G array whose [,,k]th entry is the orthonormal matrix of eigenvectors of the covariance matrix of the kth component, or a d by d orthonormal matrix if the mixture components have a common orientation. The orientation component of decomp can be omitted in spherical and diagonal models, for which the principal components are parallel to the coordinate axes so that the orientation matrix is the identity.

A 3-D array whose [,,k]th component is the covariance matrix of the kth component in an MVN mixture model.

References

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation, and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

sigma2decomp


Examples

meEst <- meVEV(irisMatrix, unmap(irisClass))
names(meEst)
meEst$decomp
meEst$sigma

dec <- meEst$decomp
decomp2sigma(d=dec$d, G=dec$G, shape=dec$shape, scale=dec$scale,
             orientation = dec$orientation)
## Not run:
do.call("decomp2sigma", meEst$decomp) ## alternative call
## End(Not run)
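The conversion described in this topic amounts to forming, for each component k, the product scale_k * O_k diag(shape_k) t(O_k). The base R sketch below shows that arithmetic under stated assumptions: shape is supplied as a d by G matrix (one column per component), orientation is either a single d by d matrix or a d by d by G array, and the function name decomp_to_sigma is hypothetical rather than part of the package.

```r
# Hypothetical helper: rebuild covariance matrices from scale, shape and
# orientation. Assumes `shape` is d x G and `orientation` is d x d or d x d x G.
decomp_to_sigma <- function(scale, shape, orientation) {
  d <- nrow(shape)
  G <- ncol(shape)
  scale <- rep(scale, length.out = G)      # allow a single common scale
  sigma <- array(0, c(d, d, G))
  for (k in seq_len(G)) {
    O <- if (length(dim(orientation)) == 3) orientation[, , k] else orientation
    sigma[, , k] <- scale[k] * O %*% diag(shape[, k], d) %*% t(O)
  }
  sigma
}

theta <- pi / 4
O <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
shape <- cbind(c(2, 0.5), c(1, 1))         # each column has product 1
sigma <- decomp_to_sigma(scale = c(1, 4), shape = shape, orientation = O)
det(sigma[, , 1])
det(sigma[, , 2])
```

Because each shape column has product 1, the determinant of each covariance equals scale_k raised to the power d, which matches the description of scale as the dth root of the determinant.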

dens Density for Parameterized MVN Mixtures

Description

Computes densities of observations in parameterized MVN mixtures.

dens(modelName, data, mu, logarithm, ...)

Arguments

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

For fitting a single Gaussian,

"X": one-dimensional
"XII": spherical
"XXI": diagonal
"XXX": ellipsoidal


logarithm Return logarithm of the density, rather than the density itself. Default: FALSE

... Other arguments, such as an argument describing the variance. See cdens.

A numeric vector whose ith component is the density of observation i in the MVN mixture specified by mu and ... .

References

See Also

grid1, cdens, mclustOptions, do.call

Examples

set.seed(0)
I <- sample(1:(2*n))
x <- x[I, ]
xclass <- xclass[I]

odd <- seq(1, 2*n, by = 2)
oddBic <- EMclust(x[odd, ])
oddSumry <- summary(oddBic, x[odd, ]) ## best parameter estimates
names(oddSumry)

oddDens <- dens(modelName = oddSumry$modelName, data = x,
                mu = oddSumry$mu, decomp = oddSumry$decomp, pro = oddSumry$pro)

## Not run:
oddDens <- do.call("dens", c(list(data = x), oddSumry)) ## alternative call
## End(Not run)

even <- odd + 1
evenBic <- EMclust(x[even, ])
evenSumry <- summary(evenBic, x[even, ]) ## best parameter estimates
evenDens <- do.call( "dens", c(list(data = x), evenSumry))

cbind(class = xclass, odd = oddDens, even = evenDens)
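The relationship between component densities (which, as the cdens topics note, are not scaled by mixing proportions) and the mixture density can be sketched in a few lines of base R. This is an illustration for the one-dimensional case only; the function name mixture_density is hypothetical and the code does not reproduce the package's internals.

```r
# Hypothetical helper: univariate Gaussian mixture density.
# Builds the n x G matrix of unscaled component densities, then weights
# the columns by the mixing proportions `pro`.
mixture_density <- function(x, mu, sigmasq, pro, logarithm = FALSE) {
  G <- length(mu)
  sigmasq <- rep(sigmasq, length.out = G)  # allow one common variance
  cden <- matrix(0, length(x), G)
  for (k in seq_len(G)) {
    cden[, k] <- dnorm(x, mean = mu[k], sd = sqrt(sigmasq[k]))
  }
  mix <- as.vector(cden %*% pro)
  if (logarithm) log(mix) else mix
}

grid <- seq(-10, 10, by = 0.01)
d <- mixture_density(grid, mu = c(-2, 3), sigmasq = c(1, 0.25),
                     pro = c(0.4, 0.6))
sum(d) * 0.01   # Riemann sum over the grid; should be close to 1
```

The final line is a quick sanity check that the weighted sum still integrates to approximately one.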

density Kernel Density Estimation

Description

This is exactly the same function as in the base package but for the method argument: if it is given and equals "mclust", the mclust density estimation is used. Optionally, the number of Gaussians to be considered can be given as well (G).

density(..., method, G)

Arguments

... Arguments to the density function in the base package.

method If equal to "mclust", EMclust is used to estimate the density.

G The number of Gaussians to consider in the model-based density estimation. Default: 1:9. Ignored if method is not equal to "mclust".

If give.Rkern is true, the number R(K), otherwise an object with class "density" whose underlying structure is a list containing the following components.

x the n coordinates of the points where the density is estimated.

y the estimated density values.

bw the bandwidth used.

N the sample size after elimination of missing values.

call the call which produced the result.

data.name the deparsed name of the x argument.

has.na logical, for compatibility (always FALSE).

References

Fraley, C. and Raftery, A.E. (2002) MCLUST: software for model-based clustering, density estimation and discriminant analysis. Technical Report No. 415, Dept. of Statistics, University of Washington.

Scott, D. W. (1992) Multivariate Density Estimation. Theory, Practice and Visualization. New York: Wiley.

Sheather, S. J. and Jones M. C. (1991) A reliable data-based bandwidth selection method for kernel density estimation. J. Roy. Statist. Soc. B, 683–690.

Silverman, B. W. (1986) Density Estimation. London: Chapman and Hall.

Venables, W. N. and Ripley, B. D. (1999) Modern Applied Statistics with S-PLUS. New York: Springer.


See Also

density (base package), bw.nrd, plot.density, hist.

Examples

plot(density(c(-20,rep(0,98),20)), xlim = c(-4,4)) # IQR = 0

# The Old Faithful geyser data
data(faithful)
d <- density(faithful$eruptions, bw = "sj")
d
plot(d)
dmc <- density(faithful$eruptions, method="mclust")
plot(dmc, type = "n")
polygon(dmc, col = "wheat")
lines(d, col="red")

## Missing values:
x <- xx <- faithful$eruptions
x[i.out <- sample(length(x), 10)] <- NA
doRmc <- density(x=x, method="mclust", na.rm = TRUE)
lines(doRmc, col="blue")
doR <- density(x, bw = 0.15, na.rm = TRUE)
lines(doR, col = "green")
rug(x)
points(xx[i.out], rep(0.01, 10))

## function formals returns something different now the original
## density function is masked...
base.density <- if(exists("density", envir = NULL)) {
  get("density", envir = NULL)
} else
  stats::density
(kernels <- eval(formals(base.density)$kernel))

## show the kernels in the R parametrization
plot (density(0, bw = 1), xlab = "",
      main="R's density() kernels with bw = 1")
for(i in 2:length(kernels))
  lines(density(0, bw = 1, kern = kernels[i]), col = i)
legend(1.5,.4, legend = kernels, col = seq(kernels),
       lty = 1, cex = .8, y.int = 1)

data(precip)
bw <- bw.SJ(precip) ## sensible automatic choice
plot(density(precip, bw = bw, n = 2^13))
lines(density(precip, G=2:5, method="mclust"), col="red")
rug(precip)

diabetes Diabetes data

Description

Diabetes data from Reaven and Miller. Number of objects: 145; 3 variables. Three classes.

data(diabetes)

References

G.M. Reaven and R.G. Miller, Diabetologia 16:17-24 (1979).

em EM algorithm starting with E-step for parameterized MVN mixture models.

Description

Implements the EM algorithm for parameterized MVN mixture models, starting with the expectation step.

em(modelName, data, mu, ...)

Arguments

modelName A character string indicating the model:

"E": equal variance (one-dimensional)
"V": variable variance (one-dimensional)

... Arguments for model-specific em functions. Specifically:

decomp for the diagonal models ("EEI", "VEI", "EVI", "VVI") and some ellipsoidal models ("EEV", "VEV"). For a description, see cdens.

• pro: Mixing proportions for the components of the mixture. There should be one more mixing proportion than the number of MVN components if the mixture model includes a Poisson noise term.

• eps: A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps. For those models with iterative M-step ("VEI", "VEV"), two values can be entered for eps, in which case the second value is used for determining singularity in the M-step.

• tol: A scalar tolerance for relative convergence of the loglikelihood. The default is .Mclust$tol. For those models with iterative M-step ("VEI", "VEV"), two values can be entered for tol, in which case the second value governs parameter convergence in the M-step.

• itmax: An integer limit on the number of EM iterations. The default is .Mclust$itmax. For those models with iterative M-step ("VEI", "VEV"), two values can be entered for itmax, in which case the second value is an upper limit on the number of iterations in the M-step.

• equalPro: Logical variable indicating whether or not the mixing proportions are equal in the model. The default is .Mclust$equalPro.

• warnSingular: A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is .Mclust$warnSingular.

• Vinv: An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data. Used only when pro includes an additional mixing proportion for a noise component.

Details

This function can be used with an indirect or list call using do.call, allowing the output of e.g. mstep to be passed without the need to specify individual parameters as arguments.

A list including the following components:

z A matrix whose [i,k]th entry is the conditional probability of the ith observation belonging to the kth component of the mixture.

loglik The loglikelihood for the data in the mixture model.

mu A matrix whose kth column is the mean of the kth component of the mixture model.


sigma For multidimensional models, a three dimensional array in which the [,,k]th entry gives the covariance for the kth group in the best model. For one-dimensional models, either a scalar giving a common variance for the groups or a vector whose entries are the variances for each group in the best model.

pro A vector whose kth component is the mixing proportion for the kth component of the mixture model.

modelName A character string identifying the model (same as the input argument).

Attributes: • "info": Information on the iteration.

• "warn": An appropriate warning if problems are encountered in the computations.

References

See Also

emE, ..., emVVV, estep, me, mstep, mclustOptions, do.call

Examples

msEst <- mstep(modelName = "EEE", data = irisMatrix,
               z = unmap(irisClass))

names(msEst)

em(modelName = msEst$modelName, data = irisMatrix,
   mu = msEst$mu, Sigma = msEst$Sigma, pro = msEst$pro)

## Not run:
do.call("em", c(list(data = irisMatrix), msEst)) ## alternative call
## End(Not run)
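To show how the tol and itmax arguments interact with an EM iteration that starts at the E-step, here is a minimal base R sketch for a one-dimensional mixture with a separate variance per component (in the spirit of model "V"). It is not the package's code: the function name em_univariate is hypothetical, the relative-change stopping rule is an assumption about what "relative convergence" means, and noise terms and singularity checks (eps, Vinv) are omitted.

```r
# Hypothetical helper: EM for a univariate Gaussian mixture, starting with
# the E-step from initial mu, sigmasq and pro.
em_univariate <- function(x, mu, sigmasq, pro, tol = 1e-5, itmax = 200) {
  G <- length(mu)
  loglik_old <- -Inf
  for (iter in seq_len(itmax)) {
    # E-step: proportion-weighted densities and conditional probabilities z
    wdens <- sapply(seq_len(G), function(k)
      pro[k] * dnorm(x, mean = mu[k], sd = sqrt(sigmasq[k])))
    loglik <- sum(log(rowSums(wdens)))
    z <- wdens / rowSums(wdens)
    # Assumed stopping rule: relative change in loglikelihood below tol
    if (is.finite(loglik_old) &&
        abs(loglik - loglik_old) <= tol * abs(loglik)) break
    loglik_old <- loglik
    # M-step: update proportions, means and variances from z
    nk <- colSums(z)
    pro <- nk / length(x)
    mu <- colSums(z * x) / nk
    sigmasq <- colSums(z * outer(x, mu, "-")^2) / nk
  }
  list(z = z, loglik = loglik, mu = mu, sigmasq = sigmasq, pro = pro,
       iterations = iter)
}

set.seed(1)
x <- c(rnorm(100, mean = 0), rnorm(100, mean = 6))
fit <- em_univariate(x, mu = c(-1, 5), sigmasq = c(1, 1), pro = c(0.5, 0.5))
fit$mu
fit$iterations
```

Checking convergence right after the E-step means the returned z and loglik correspond to the returned parameter values.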

emE EM algorithm starting with E-step for a parameterized MVN mixture model.

Description

Implements the EM algorithm for a parameterized MVN mixture model, starting with the expectation step.


emE(data, mu, sigmasq, pro, eps, tol, itmax, equalPro, warnSingular,
    Vinv, ...)
emV(data, mu, sigmasq, pro, eps, tol, itmax, equalPro, warnSingular,
    Vinv, ...)
emEII(data, mu, sigmasq, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emVII(data, mu, sigmasq, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emEEI(data, mu, decomp, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emVEI(data, mu, decomp, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emEVI(data, mu, decomp, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emVVI(data, mu, decomp, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emEEE(data, mu, Sigma, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emEEV(data, mu, decomp, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emVEV(data, mu, decomp, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)
emVVV(data, mu, sigma, pro, eps, tol, itmax, equalPro, warnSingular,
      Vinv, ...)

Arguments

Sigma for the equal variance model "EEE". A d by d matrix giving the common covariance for all components of the mixture model.

sigma for the unconstrained variance model "VVV". A d by d by G array whose [,,k]th entry is the covariance matrix for the kth component of the mixture model.

... An argument giving the variance that takes one of the following forms:

decomp for models "VVV", "EII" and "VII"; see cdens.

cholSigma see Sigma, for "EEE".

cholsigma see sigma, for "VVV".

sigma see sigma, for "VVV".


Sigma see Sigma, for "EEE".

The form of the variance specification is the same as for the output for the em, me, or mstep methods for the specified mixture model. Also used to catch unused arguments from a do.call call.

pro Mixing proportions for the components of the mixture. There should be one more mixing proportion than the number of MVN components if the mixture model includes a Poisson noise term.

tol A scalar tolerance for relative convergence of the loglikelihood values. The default is .Mclust$tol.

equalPro A logical value indicating whether or not the components in the model are present in equal proportions. The default is .Mclust$equalPro.

Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data. Used only when pro includes an additional mixing proportion for a noise component.

Details

modelName Character string identifying the model.

Attributes: • "info": Information on the iteration.

• "warn": An appropriate warning if problems are encountered in the computations.


References

See Also

em, mstep, mclustOptions, do.call

Examples

msEst <- mstepEEE(data = irisMatrix, z = unmap(irisClass))
names(msEst)

emEEE(data = irisMatrix, mu = msEst$mu, pro = msEst$pro,
      cholSigma = msEst$cholSigma)
## Not run:
do.call("emEEE", c(list(data=irisMatrix), msEst)) ## alternative call
## End(Not run)

estep E-step for parameterized MVN mixture models.

Description

Implements the expectation step of EM algorithm for parameterized MVN mixture models.

estep(modelName, data, mu, ...)

Arguments

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume and shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume and shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

• An argument describing the variance (depends on the model):

sigmasq for the one-dimensional models ("E", "V") and spherical models ("EII", "VII"). This is either a vector whose kth component is the variance for the kth component in the mixture model ("V" and "VII"), or a scalar giving the common variance for all components in the mixture model ("E" and "EII").

decomp for the diagonal models ("EEI", "VEI", "EVI", "VVI") and some ellipsoidal models ("EEV", "VEV"). This is a list described in cdens.

pro Mixing proportions for the components of the mixture. There should be one more mixing proportion than the number of MVN components if the mixture model includes a Poisson noise term.

eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is .Mclust$warnSingular.

Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data. Used only when pro includes an additional mixing proportion for a noise component.

Details

Attribute • "warn": An appropriate warning if problems are encountered in the computations.


References

See Also

estepE, ..., estepVVV, em, mstep, do.call, mclustOptions

Examples

msEst <- mstep(modelName = "EII", data = irisMatrix,
               z = unmap(irisClass))

names(msEst)

estep(modelName = msEst$modelName, data = irisMatrix,
      mu = msEst$mu, sigmasq = msEst$sigmasq, pro = msEst$pro)

## Not run:
do.call("estep", c(list(data = irisMatrix), msEst)) ## alternative call
## End(Not run)
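For readers who want to see what an expectation step computes, the base R sketch below evaluates the conditional probabilities for the spherical models, where the variance is described by sigmasq. It works in log space for numerical stability. The function name estep_spherical is hypothetical, mu is assumed to be a d by G matrix whose columns are the component means, and the noise component (Vinv) and singularity handling (eps) are left out.

```r
# Hypothetical helper: E-step for spherical Gaussian mixtures.
# `sigmasq` is a scalar (common variance) or a vector of length G.
estep_spherical <- function(data, mu, sigmasq, pro) {
  data <- as.matrix(data)
  n <- nrow(data)
  d <- ncol(data)
  G <- length(pro)
  sigmasq <- rep(sigmasq, length.out = G)
  logdens <- matrix(0, n, G)
  for (k in seq_len(G)) {
    sqdist <- rowSums(sweep(data, 2, mu[, k])^2)  # squared distance to mean k
    logdens[, k] <- log(pro[k]) - 0.5 * d * log(2 * pi * sigmasq[k]) -
      0.5 * sqdist / sigmasq[k]
  }
  # log-sum-exp over components, row by row
  rowmax <- apply(logdens, 1, max)
  logsum <- rowmax + log(rowSums(exp(logdens - rowmax)))
  list(z = exp(logdens - logsum), loglik = sum(logsum))
}

set.seed(4)
pts <- rbind(matrix(rnorm(40, mean = 0), 20, 2),
             matrix(rnorm(40, mean = 10), 20, 2))
es <- estep_spherical(pts, mu = cbind(c(0, 0), c(10, 10)),
                      sigmasq = 1, pro = c(0.5, 0.5))
apply(es$z, 1, which.max)
```

Each row of z sums to one, and taking the column with the largest entry in each row gives a classification of the observations.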

estepE E-step in the EM algorithm for a parameterized MVN mixture model.

Description

Implements the expectation step in the EM algorithm for a parameterized MVN mixture model.

estepE(data, mu, sigmasq, pro, eps, warnSingular, Vinv, ...)
estepV(data, mu, sigmasq, pro, eps, warnSingular, Vinv, ...)
estepEII(data, mu, sigmasq, pro, eps, warnSingular, Vinv, ...)
estepVII(data, mu, sigmasq, pro, eps, warnSingular, Vinv, ...)
estepEEI(data, mu, decomp, pro, eps, warnSingular, Vinv, ...)
estepVEI(data, mu, decomp, pro, eps, warnSingular, Vinv, ...)
estepEVI(data, mu, decomp, pro, eps, warnSingular, Vinv, ...)
estepVVI(data, mu, decomp, pro, eps, warnSingular, Vinv, ...)
estepEEE(data, mu, Sigma, pro, eps, warnSingular, Vinv, ...)
estepEEV(data, mu, decomp, pro, eps, warnSingular, Vinv, ...)
estepVEV(data, mu, decomp, pro, eps, warnSingular, Vinv, ...)
estepVVV(data, mu, sigma, pro, eps, warnSingular, Vinv, ...)


Arguments

sigma for the unconstrained variance model "VVV" or the equal variance model "EEE". A d by d by G array whose [,,k]th entry is the covariance matrix for the kth component of the mixture model.

Sigma for the equal variance model "EEE". A d by d matrix giving the common covariance for all components of the mixture model.

pro Mixing proportions for the components of the mixture. There should be one more mixing proportion than the number of MVN components if the mixture model includes a Poisson noise term.

Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data. Used only when pro includes an additional mixing proportion for a noise component.

... Other arguments to describe the variance, in particular decomp, sigma or cholsigma for model "VVV", decomp for models "VII" and "EII", and Sigma or cholSigma for model "EEE". Sigma is a d by d matrix giving the common covariance for all components of the mixture model. Also used to catch unused arguments from a do.call call.

Details

Attribute • "warn": An appropriate warning if problems are encountered in the computations.


References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association. See http://www.stat.washington.edu/mclust.

See Also

estep, em, mstep, do.call, mclustOptions

Examples

msEst <- mstepEII(data = irisMatrix, z = unmap(irisClass))
names(msEst)

estepEII(data = irisMatrix, mu = msEst$mu, pro = msEst$pro,
         sigmasq = msEst$sigmasq)

## Not run:
do.call("estepEII", c(list(data=irisMatrix), msEst)) ## alternative call
## End(Not run)

grid1 Generate grid points

Description

Generate grid points in one or two dimensions.

grid1(n, range = c(0, 1), edge = TRUE)
grid2(x, y)

Arguments

n Number of grid points.

range Range of grid points.

edge Logical: include edges or not?

x, y Vectors.

The value returned is simple: grid1 generates a vector; grid2 generates a matrix.

Author(s)

C. Fraley

See Also

lansing, dens

Examples

data(lansing)
maples <- lansing[as.character(lansing[,"species"]) == "maple", -3]
maplesBIC <- EMclust(maples)
maplesModel <- summary(maplesBIC, maples)
x <- grid1(100, range=c(0,1))
y <- x
xyDens <- do.call("dens", c(list(data=grid2(x, y)), maplesModel))
xyDens <- matrix(xyDens, ncol=100)
contour(xyDens)
points(maples, cex=.2, col="red")

image(xyDens)
points(maples, cex=.5)

hc Model-based Hierarchical Clustering

Description

Agglomerative hierarchical clustering based on maximum likelihood criteria for MVN mixture models parameterized by eigenvalue decomposition.

hc(modelName, data, ...)

Arguments

"E": equal variance (one-dimensional)
"V": spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEE": ellipsoidal, equal volume, shape, and orientation
"VVV": ellipsoidal, varying volume, shape, and orientation

... Arguments for the method-specific hc functions. See hcE.

Details

Most models have memory usage of the order of the square of the number of groups in the initial partition for fast execution. Some models, such as equal variance or "EEE", do not admit a fast algorithm under the usual agglomerative hierarchical clustering paradigm. These use less memory but are much slower to execute.

A numeric two-column matrix in which the ith row gives the minimum index for observations in each of the two clusters merged at the ith stage of agglomerative hierarchical clustering.

References

J. D. Banfield and A. E. Raftery (1993). Model-based Gaussian and non-Gaussian Clustering. Biometrics 49:803-821.

C. Fraley (1998). Algorithms for model-based Gaussian hierarchical clustering. SIAM Journal on Scientific Computing 20:270-281. See http://www.stat.washington.edu/mclust.

If modelName = "E" (univariate with equal variances) or modelName = "EII" (multivariate with equal spherical covariances), then the method is equivalent to Ward's method for hierarchical clustering.

See Also

hcE, ..., hcVVV, hclass

Examples

hcTree <- hc(modelName = "VVV", data = irisMatrix)
cl <- hclass(hcTree,c(2,3))

par(pty = "s", mfrow = c(1,1))
clPairs(irisMatrix,cl=cl[,"2"])
clPairs(irisMatrix,cl=cl[,"3"])

par(mfrow = c(1,2))
dimens <- c(1,2)
coordProj(irisMatrix, classification=cl[,"2"], dimens=dimens)
coordProj(irisMatrix, classification=cl[,"3"], dimens=dimens)


hcE Model-based Hierarchical Clustering

Description

Agglomerative hierarchical clustering based on maximum likelihood for a MVN mixture model parameterized by eigenvalue decomposition.

hcE(data, partition, minclus = 1, ...)
hcV(data, partition, minclus = 1, alpha = 1, ...)
hcEII(data, partition, minclus = 1, ...)
hcVII(data, partition, minclus = 1, alpha = 1, ...)
hcEEE(data, partition, minclus = 1, ...)
hcVVV(data, partition, minclus = 1, alpha = 1, beta = 1, ...)

Arguments

partition A numeric or character vector representing a partition of observations (rows) of data. If provided, group merges will start with this partition. Otherwise, each observation is assumed to be in a cluster by itself at the start of agglomeration.

minclus A number indicating the number of clusters at which to stop the agglomeration. The default is to stop when all observations have been merged into a single cluster.

alpha, beta Additional tuning parameters needed for initialization in some models. For details, see Fraley 1998. The defaults provided are usually adequate.

Details

Most models have memory usage of the order of the square of the number groups in the initialpartition for fast execution. Some models, such as equal variance or"EEE" , do not admit a fastalgorithm under the usual agglomerative hierachical clustering paradigm. These use less memorybut are much slower to execute.

A numeric two-column matrix in which the ith row gives the minimum index for observations in each of the two clusters merged at the ith stage of agglomerative hierarchical clustering.

References

J. D. Banfield and A. E. Raftery (1993). Model-based Gaussian and non-Gaussian Clustering. Biometrics 49:803-821.

C. Fraley (1998). Algorithms for model-based Gaussian hierarchical clustering. SIAM Journal on Scientific Computing 20:270-281. See http://www.stat.washington.edu/mclust.


See Also

hc, hclass

Examples

hcTree <- hcEII(data = irisMatrix)
cl <- hclass(hcTree,c(2,3))

par(mfrow = c(1,2))
dimens <- c(1,2)
coordProj(irisMatrix, classification=cl[,"2"], dimens=dimens)
coordProj(irisMatrix, classification=cl[,"3"], dimens=dimens)

hclass Classifications from Hierarchical Agglomeration

Description

Determines the classifications corresponding to different numbers of groups given merge pairs from hierarchical agglomeration.

hclass(hcPairs, G)

Arguments

hcPairs A numeric two-column matrix in which the ith row gives the minimum index for observations in each of the two clusters merged at the ith stage of agglomerative hierarchical clustering.

G An integer or vector of integers giving the number of clusters for which the corresponding classifications are wanted.

A matrix with length(G) columns, each column corresponding to a classification. Columns are indexed by the character representation of the integers in G.
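The relationship between merge pairs and classifications can be illustrated in base R. The following is a minimal sketch, not the package's implementation: pairsToClasses is a hypothetical helper, and it assumes that agglomeration started from singleton clusters and ran until a single cluster remained, so that the merge matrix has n - 1 rows.

```r
# Hypothetical helper, not the package's hclass(): turn merge pairs into
# classifications, assuming agglomeration starts from singleton clusters.
pairsToClasses <- function(hcPairs, G) {
  n <- nrow(hcPairs) + 1          # n - 1 merges reduce n singletons to one cluster
  out <- sapply(G, function(g) {
    labels <- seq_len(n)          # each cluster is labelled by its minimum index
    for (i in seq_len(n - g)) {   # apply merges until g clusters remain
      keep <- min(hcPairs[i, ])
      drop <- max(hcPairs[i, ])
      labels[labels == drop] <- keep
    }
    labels
  })
  colnames(out) <- as.character(G)  # columns indexed by the character form of G
  out
}

# Four observations: merge {1,2}, then {3,4}, then the two resulting clusters.
merges <- rbind(c(1, 2), c(3, 4), c(1, 3))
res <- pairsToClasses(merges, G = c(1, 2))
res
```

In this sketch each class is labelled by the smallest observation index it contains; the labels used by hclass itself may differ.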


References

See Also

hc, hcE

Examples

data(iris)
irisMatrix <- iris[,1:4]

hcTree <- hc(modelName="VVV", data = irisMatrix)
cl <- hclass(hcTree,c(2,3))

hypvol Approximate Hypervolume for Multivariate Data

Description

Computes a simple approximation to the hypervolume of a multivariate data set.

hypvol(data, reciprocal=FALSE)

Arguments

reciprocal A logical variable indicating whether or not the reciprocal hypervolume is desired rather than the hypervolume itself. The default is to return the approximate hypervolume.

Computes the hypervolume by two methods: simple variable bounds and principal components, and returns the minimum value.
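The two methods can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: approxHypvol is a hypothetical name, "variable bounds" is taken to mean the axis-aligned bounding box, and "principal components" is taken to mean the bounding box of the data after rotation onto its principal axes.

```r
# Hypothetical re-implementation for illustration; the package's hypvol()
# may differ in detail (for example in how the data are centred or scaled).
approxHypvol <- function(data, reciprocal = FALSE) {
  data <- as.matrix(data)
  # Method 1: volume of the axis-aligned bounding box.
  boxVol <- prod(apply(data, 2, function(v) diff(range(v))))
  # Method 2: bounding box after rotating onto the principal components.
  scores <- prcomp(data)$x
  pcVol <- prod(apply(scores, 2, function(v) diff(range(v))))
  vol <- min(boxVol, pcVol)
  if (reciprocal) 1 / vol else vol
}

data(iris)
irisMatrix <- as.matrix(iris[, 1:4])
approxHypvol(irisMatrix)
approxHypvol(irisMatrix, reciprocal = TRUE)
```

Taking the minimum gives the tighter of the two boxes, which matters when the data are elongated along a direction that is not aligned with the coordinate axes.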

References


Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
hypvol(irisMatrix)

lansing Maple trees in Lansing Woods

Description

The lansing data frame has 1217 rows and 3 columns. The first two columns give the location, the third column the tree type.

data(lansing)

Format

This data frame contains the following columns:

x a numeric vector

y a numeric vector

species a factor with levels hickory and maple

Source

D.J. Gerrard, Research Bulletin No. 20, Agricultural Experimental Station, Michigan State University, 1969.

See Also

grid1, dens

Examples

data(lansing)
plot(lansing[,1:2], pch=as.integer(lansing[,3]),
     col=as.integer(lansing[,3]), main="Lansing Woods tree types")


map Classification given Probabilities

Description

Converts a matrix in which each row sums to 1 into the nearest matrix of (0,1) indicator variables.

map(z, warn=TRUE, ...)

Arguments

z A matrix (for example a matrix of conditional probabilities in which each row sums to 1 as produced by the E-step of the EM algorithm).

warn A logical variable indicating whether or not a warning should be issued when there are some columns of z for which no row attains a maximum.

An integer vector with one entry for each row of z, in which the i-th value is the column index at which the i-th row of z attains a maximum.
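The returned vector is a row-wise argmax. A minimal base-R sketch of that operation is shown below; hardClassify is a hypothetical name, and the sketch omits the warning behaviour controlled by warn.

```r
# Base-R sketch of the row-wise argmax; hardClassify is a hypothetical name.
hardClassify <- function(z) {
  apply(z, 1, which.max)   # first column index attaining each row's maximum
}

z <- rbind(c(0.7, 0.2, 0.1),
           c(0.1, 0.1, 0.8),
           c(0.3, 0.4, 0.3))
hardClassify(z)   # 1 3 2
```

Ties within a row are resolved in favour of the lowest column index here; how the package resolves ties is not stated in this documentation.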

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631.

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington.

See http://www.stat.washington.edu/mclust.

See Also

unmap, estep, em, me

Examples

emEst <- me(modelName = "VVV", data = irisMatrix, z = unmap(irisClass))

map(emEst$z)


mapClass Correspondence between classifications.

Description

Best correspondence between classes given two vectors viewed as alternative classifications of the same object.

mapClass(a, b)

Arguments

a A numeric or character vector of class labels.

b A numeric or character vector of class labels. Must have the same length as a.

A list with two named elements, aTOb and bTOa, which are themselves lists. The aTOb list has a component corresponding to each unique element of a, which gives the element or elements of b that result in the closest class correspondence.

The bTOa list has a component corresponding to each unique element of b, which gives the element or elements of a that result in the closest class correspondence.
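One way to picture the two lists is through the cross-tabulation of a and b. The sketch below is a hypothetical base-R helper (bestMatch is not a package function) and rests on the assumption that "closest class correspondence" means the label or labels with the largest overlap count; the package may use a different criterion.

```r
# Hypothetical sketch, assuming "closest correspondence" means the label(s)
# with the largest overlap in the cross-tabulation of a and b.
# Labels are returned as character strings taken from the table's dimnames.
bestMatch <- function(a, b) {
  tab <- table(a, b)
  aTOb <- lapply(rownames(tab), function(r) {
    colnames(tab)[tab[r, ] == max(tab[r, ])]
  })
  names(aTOb) <- rownames(tab)
  bTOa <- lapply(colnames(tab), function(cc) {
    rownames(tab)[tab[, cc] == max(tab[, cc])]
  })
  names(bTOa) <- colnames(tab)
  list(aTOb = aTOb, bTOa = bTOa)
}

a <- rep(1:3, 3)
b <- rep(c("A", "B", "C"), 3)
m <- bestMatch(a, b)
m$aTOb
```

When several labels tie for the largest overlap, all of them are returned, which is why each component may hold more than one element.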

See Also

mapClass, classError, table

Examples

a <- rep(1:3, 3)
a
b <- rep(c("A", "B", "C"), 3)
b
mapClass(a, b)
a <- sample(1:3, 9, replace = TRUE)
a
b <- sample(c("A", "B", "C"), 9, replace = TRUE)
b
mapClass(a, b)

mclust-internal Internal MCLUST functions

Description

Internal tools functions.

Details

These are not to be called by the user directly.


mclust1Dplot Plot one-dimensional data modelled by an MVN mixture.

Description

Plot one-dimensional data given parameters of an MVN mixture model for the data.

mclust1Dplot(data, ...,
             type = c("classification","uncertainty","density","errors"),
             ask = TRUE, symbols, grid = 100, identify = FALSE, CEX = 1, xlim)

Arguments

data A numeric vector of observations. Categorical variables are not allowed.

z A matrix in which the [i,k]th entry gives the probability of observation i belonging to the kth class. Used to compute classification and uncertainty if those arguments aren’t available.

mu A vector whose entries are the means of each group.

sigma Either a vector whose entries are the variances for each group or a scalar giving a common variance for the groups.

pro The vector of mixing proportions.

type Any subset of c("classification","uncertainty","density","errors"). The function will produce the corresponding plot if it has been supplied sufficient information to do so. If more than one plot is possible then users will be asked to choose from a menu if ask=TRUE.

symbols Either an integer or character vector assigning a plotting symbol to each unique class in classification. Elements in symbols correspond to classes in classification in order of appearance in the observations (the order used by the function unique). The default is to use a single plotting symbol |. Classes are delineated by showing them in separate lines above the whole of the data.

grid Number of grid points to use.

xlim An argument specifying bounds of the plot. This may be useful when comparing plots.


Side Effects

One or more plots showing location of the mixture components, classification, uncertainty, density and/or classification errors. Points in the different classes are shown in separate lines above the whole of the data.

References

See Also

mclust2Dplot, clPairs, coordProj, do.call

Examples

n <- 250 ## create artificial data
set.seed(0)
y <- c(rnorm(n,-5), rnorm(n,0), rnorm(n,5))
yclass <- c(rep(1,n), rep(2,n), rep(3,n))

yEMclust <- summary(EMclust(y),y)

mclust1Dplot(y, identify = TRUE, truth = yclass, z = yEMclust$z, ask=FALSE,
             mu = yEMclust$mu, sigma = yEMclust$sigma, pro = yEMclust$pro)

do.call("mclust1Dplot",c(list(data = y, identify = TRUE, truth = yclass, ask=FALSE),
        yEMclust))

mclust2Dplot Plot two-dimensional data modelled by an MVN mixture.

Description

Plot two-dimensional data given parameters of an MVN mixture model for the data.

mclust2Dplot(data, ...,
             type = c("classification","uncertainty","errors"), ask = TRUE,
             quantiles = c(0.75, 0.95), symbols, scale = FALSE,
             identify = FALSE, CEX = 1, PCH = ".", xlim, ylim,
             swapAxes = FALSE)


Arguments

data A numeric matrix or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables. In this case the data are two dimensional, so there are two columns.

mu A matrix whose columns are the means of each group.

sigma A three dimensional array in which sigma[,,k] gives the covariance for the kth group.

decomp A list with scale, shape and orientation components giving an alternative form for the covariance structure of the mixture model.

symbols Either an integer or character vector assigning a plotting symbol to each unique class in classification. Elements in symbols correspond to classes in classification in order of appearance in the observations (the order used by the S-PLUS function unique). Default: If G is the number of groups in the classification, the first G symbols in .Mclust$symbols, otherwise if G is less than 27 then the first G capital letters in the Roman alphabet.

xlim, ylim Arguments specifying bounds for the abscissa and ordinate of the plot, respectively. This may be useful when comparing plots.


swapAxes A logical variable indicating whether or not the axes should be swapped for the plot.

Side Effects

One or more plots showing location of the mixture components, classification, uncertainty, and/or classification errors.

References

See Also

surfacePlot, clPairs, coordProj, randProj, spinProj, mclustOptions, do.call

Examples

n <- 250 ## create artificial data
set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))

xEMclust <- summary(EMclust(x),x)

mclust2Dplot(x, truth = xclass, z = xEMclust$z, ask=FALSE,
             mu = xEMclust$mu, sigma = xEMclust$sigma)

do.call("mclust2Dplot", c(list(data = x, truth = xclass, ask=FALSE), xEMclust))

mclustDA MclustDA discriminant analysis.

Description

MclustDA training and testing.

mclustDA(trainingData, labels, testData, G=1:6, verbose = FALSE)


Arguments

trainingData A numeric vector, matrix, or data frame of training observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.

labels A numeric or character vector assigning a class label to each training observation.

testData A numeric vector, matrix, or data frame of test observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.

G An integer vector specifying the numbers of mixture components (clusters) to be considered for each class. Default: 1:6.

verbose A logical variable telling whether or not to print an indication that the function is in the training phase, which may take some time to complete.

A list with the following components:

testClassification mclustDA classification of the test data.

trainingClassification mclustDA classification of the training data.

VofIindex Meila’s Variation of Information index, to compare classification of the training data to the known labels.

summary Gives the best model and number of clusters for each training class.

models The mixture models used to fit the known classes.

postProb A matrix whose [i,k]th entry is the probability that observation i in the test data belongs to the kth class.

Details

The following models are compared in Mclust:

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"VVV": ellipsoidal, varying volume, shape, and orientation

mclustDA is a simplified function combining mclustDAtrain and mclustDAtest and their summaries.

References


M. Meila (2002). Comparing clusterings. Technical Report 418, Department of Statistics, University of Washington. See http://www.stat.washington.edu/www/research/reports.

See Also

plot.mclustDA, mclustDAtrain, mclustDAtest, compareClass, classError

Examples

## Not run:
par(pty = "s")
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

odd <- seq(from = 1, to = 2*n, by = 2)
even <- odd + 1
testMclustDA <- mclustDA(trainingData = x[odd, ], labels = xclass[odd],
                         testData = x[even,])

clEven <- testMclustDA$testClassification ## classification of the test set
compareClass(clEven,xclass[even])
## Not run:
plot(testMclustDA, trainingData = x[odd, ], labels = xclass[odd],
     testData = x[even,])
## End(Not run)

mclustDAtest MclustDA Testing

Description

Testing phase for MclustDA discriminant analysis.

mclustDAtest(data, models)

Arguments

data A numeric vector, matrix, or data frame of observations to be classified.

models A list of MCLUST-style models including parameters, usually the result of applying mclustDAtrain to some training data.


A matrix in which the [i,j]th entry is the density of test observation i under the model for class j.
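A density matrix of this shape can be turned into class assignments by picking, for each row, the class with the largest density. The sketch below uses made-up numbers and hypothetical variable names (dens, postProb, classification), and it assumes equal prior weight for every class; the package's summary method may weight the classes differently.

```r
# Hypothetical example: rows are test observations, columns are classes.
dens <- rbind(c(0.20, 0.01),
              c(0.03, 0.15),
              c(0.08, 0.02))
colnames(dens) <- c("1", "2")

# Assumption: equal prior weight for each class, so the posterior
# probabilities are the densities normalised within each row.
postProb <- dens / rowSums(dens)

# Assign each observation to the class with the largest density.
classification <- colnames(dens)[max.col(dens, ties.method = "first")]
classification   # "1" "2" "1"
```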

References

See Also

summary.mclustDAtest, mclustDAtrain

Examples

n <- 250 ## create artificial data
set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))
## Not run:
par(pty = "s")
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

odd <- seq(1, 2*n, 2)
train <- mclustDAtrain(x[odd, ], labels = xclass[odd]) ## training step
summary(train)

even <- odd + 1
test <- mclustDAtest(x[even, ], train) ## compute model densities
summary(test)$class ## classify test set

mclustDAtrain MclustDA Training

Description

Training phase for MclustDA discriminant analysis.

mclustDAtrain(data, labels, G, emModelNames, eps, tol, itmax,
              equalPro, warnSingular, verbose)


Arguments

labels A numeric or character vector assigning a class label to each observation.

G An integer vector specifying the numbers of Gaussian mixture components (clusters) for which the BIC is to be calculated (the same specification is used for all classes). Default: 1:9.

emModelNames A vector of character strings indicating the models to be fitted in the EM phase of clustering. Possible models:
"E" for spherical, equal variance (one-dimensional)
"V" for spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

verbose A logical value indicating whether or not to print the models and numbers of components for each class. Default: verbose=TRUE.

A list in which each element gives the optimal parameters for the model best fitting each class according to BIC.

References


See Also

summary.mclustDAtrain, mclustDAtest, EMclust, hc, mclustOptions

Examples

n <- 250 ## create artificial data
set.seed(0)
par(pty = "s")
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))
## Not run:
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

odd <- seq(1, 2*n, 2)
train <- mclustDAtrain(x[odd, ], labels = xclass[odd]) ## training step
summary(train)

even <- odd + 1
test <- mclustDAtest(x[even, ], train) ## compute model densities
clEven <- summary(test)$class ## classify test set
compareClass(clEven,xclass[even])

mclustOptions Set control values for use with MCLUST.

Description

Supplies a list of values including tolerances for singularity and convergence assessment, and an enumeration of models for use with MCLUST.

mclustOptions(eps, tol, itmax, equalPro, warnSingular, emModelNames,
              hcModelName, symbols)

Arguments

eps A scalar tolerance associated with deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is the relative machine precision .Machine$double.eps, which is approximately 2e-16 on IEEE-compliant machines.

tol A vector of length two giving relative convergence tolerances for the loglikelihood and for parameter convergence in the inner loop for models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), respectively. The default is c(1.e-5,1.e-5).

itmax A vector of length two giving integer limits on the number of EM iterations and on the number of iterations in the inner loop for models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), respectively. The default is c(Inf,Inf), allowing termination to be completely governed by tol.


equalPro Logical variable indicating whether or not the mixing proportions are equal in the model. Default: equalPro = FALSE.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is warnSingular = TRUE.

emModelNames A vector of character strings associated with multivariate models in MCLUST. The default includes strings encoding all of the multivariate models available:

hcModelName A vector of two character strings giving the name of the model to be used in the hierarchical clustering phase for univariate and multivariate data, respectively, in EMclust and EMclustN. The default is c("V","VVV"), giving the unconstrained model in each case.

symbols A vector whose entries are either integers corresponding to graphics symbols or single characters for plotting for classifications. Classes are assigned symbols in the given order. The default is c(17,0,10,4,11,18,6,7,3,16,2,12,8,15,1,9,14,13,5).

Details

mclustOptions is provided for assigning values to the .Mclust list, which is used to supply default values to various functions in MCLUST.

Calls to mclustOptions do not in themselves affect the outcome of computations.

A named list in which the names are the names of the arguments and the values are the values supplied to the arguments.

References

See Also

.Mclust

Examples

.Mclust

.Mclust <- mclustOptions(tol = 1.e-6, emModelNames = c("VII", "VVI", "VVV"))

.Mclust
irisBic <- EMclust(irisMatrix)
summary(irisBic, irisMatrix)
.Mclust <- mclustOptions() # restore default values
.Mclust

me EM algorithm starting with M-step for parameterized MVN mixture models.

Description

Implements the EM algorithm for parameterized MVN mixture models, starting with the maximization step.

me(modelName, data, z, ...)

Arguments

modelName A character string indicating the model:
"E": equal variance (one-dimensional)
"V": variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume and shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume and shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

z A matrix whose [i,k]th entry is the conditional probability of the ith observation belonging to the kth component of the mixture.

... Any number of the following:

eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.

For those models with iterative M-step ("VEI", "VEV"), two values can be entered for eps, in which case the second value is used for determining singularity in the M-step.

tol A scalar tolerance for relative convergence of the loglikelihood. The default is .Mclust$tol. For those models with iterative M-step ("VEI", "VEV"), two values can be entered for tol, in which case the second value governs parameter convergence in the M-step.

itmax An integer limit on the number of EM iterations. The default is .Mclust$itmax. For those models with iterative M-step ("VEI", "VEV"), two values can be entered for itmax, in which case the second value is an upper limit on the number of iterations in the M-step.

equalPro Logical variable indicating whether or not the mixing proportions are equal in the model. The default is .Mclust$equalPro.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is .Mclust$warnSingular.

noise A logical value indicating whether or not the model includes a Poisson noise component. The default assumes there is no noise component.

Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data. Used only when noise = TRUE.

Attributes: "info" Information on the iteration.

"warn" An appropriate warning if problems are encountered in the computations.

References


See Also

meE, ..., meVVV, em, mstep, estep, mclustOptions

Examples

me(modelName = "VVV", data = irisMatrix, z = unmap(irisClass))

meE EM algorithm starting with M-step for a parameterized MVN mixture model.

Description

Implements the EM algorithm for a parameterized MVN mixture model, starting with the maximization step.

meE(data, z, eps, tol, itmax, equalPro, warnSingular,
    noise = FALSE, Vinv)

meV(data, z, eps, tol, itmax, equalPro, warnSingular,
    noise = FALSE, Vinv)

meEII(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meVII(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meEEI(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meVEI(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meEVI(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meVVI(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meEEE(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meEEV(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meVEV(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

meVVV(data, z, eps, tol, itmax, equalPro, warnSingular,
      noise = FALSE, Vinv)

Arguments


eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.

tol A scalar tolerance for relative convergence of the loglikelihood values. The default is .Mclust$tol.

Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data. Used only when noise = TRUE.

Attributes: The return value also has the following attributes:

"info": Information on the iteration.

"warn": An appropriate warning if problems are encountered in the computations.

References


See Also

em, me, estep, mclustOptions

Examples

meVVV(data = irisMatrix, z = unmap(irisClass))

mstep M-step in the EM algorithm for parameterized MVN mixture models.

Description

Maximization step in the EM algorithm for parameterized MVN mixture models.

mstep(modelName, data, z, ...)

Arguments

"E": equal variance (one-dimensional)
"V": variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume and shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume and shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

equalPro A logical value indicating whether or not the components in the model are present in equal proportions. The default is .Mclust$equalPro.

noise A logical value indicating whether or not the model includes a Poisson noise component. The default assumes there is no noise component.


eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps. Not used for models "EII", "VII", "EEE", "VVV".

tol For models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), a scalar tolerance for relative convergence of the parameters. The default is .Mclust$tol.

itmax For models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), an integer limit on the number of EM iterations. The default is .Mclust$itmax.

warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is .Mclust$warnSingular. Not used for models "EII", "VII", "EEE", "VVV".

Attributes:

"info": Information on the iteration.

"warn": An appropriate warning if problems are encountered in the computations.

References

See Also

mstepE, ..., mstepVVV, me, estep, mclustOptions

Examples

mstep(modelName = "VII", data = irisMatrix, z = unmap(irisClass))


mstepE M-step in the EM algorithm for a parameterized MVN mixture model.

Description

Maximization step in the EM algorithm for a parameterized MVN mixture model.

mstepE(data, z, equalPro, noise = FALSE, ...)
mstepV(data, z, equalPro, noise = FALSE, ...)
mstepEII(data, z, equalPro, noise = FALSE, ...)
mstepVII(data, z, equalPro, noise = FALSE, ...)
mstepEEI(data, z, equalPro, noise = FALSE, eps, warnSingular, ...)
mstepVEI(data, z, equalPro, noise = FALSE, eps, tol, itmax, warnSingular, ...)
mstepEVI(data, z, equalPro, noise = FALSE, eps, warnSingular, ...)
mstepVVI(data, z, equalPro, noise = FALSE, eps, warnSingular, ...)
mstepEEE(data, z, equalPro, noise = FALSE, ...)
mstepEEV(data, z, equalPro, noise = FALSE, eps, warnSingular, ...)
mstepVVV(data, z, equalPro, noise = FALSE, ...)

Arguments

equalPro A logical value indicating whether or not the components in the model are present in equal proportions. The default is .Mclust$equalPro.

eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.

Not used for models "EII", "VII", "EEE", "VVV".

tol For models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), a scalar tolerance for relative convergence of the parameters. The default is .Mclust$tol.

itmax For models with iterative M-step ("VEI", "VEE", "VVE", "VEV"), an integer limit on the number of EM iterations. The default is .Mclust$itmax.

Not used for models "EII", "VII", "EEE", "VVV".


Attributes:

"info" Information on the iteration.

"warn" An appropriate warning if problems are encountered in the computations.

References

See Also

mstep, me, estep, mclustOptions

Examples

mstepVII(data = irisMatrix, z = unmap(irisClass))

mvn Multivariate Normal Fit

Description

Computes the mean, covariance, and loglikelihood from fitting a single MVN or Gaussian to given data.


mvn(modelName, data)

Arguments

modelName A character string representing a model name. This can be either "Spherical", "Diagonal", or "Ellipsoidal" or an MCLUST-style model name:
"E", "V", "X" (one-dimensional)
"EII", "VII", "XII" (spherical)
"EEI", "VEI", "EVI", "VVI", "XXI" (diagonal)
"EEE", "EEV", "VEV", "VVV", "XXX" (ellipsoidal)

A list including the parameters of the Gaussian model best fitting the data, and the corresponding loglikelihood for the data under the model.

References

See Also

mvnX, mvnXII, mvnXXI, mvnXXX, mstep

Examples

n <- 1000

set.seed(0)
x <- rnorm(n, mean = -1, sd = 2)
mvn(modelName = "X", x)

mu <- c(-1, 0, 1)

set.seed(0)
x <- sweep(matrix(rnorm(n*3), n, 3) %*% (2*diag(3)),
           MARGIN = 2, STATS = mu, FUN = "+")
mvn(modelName = "XII", x)
mvn(modelName = "Spherical", x)

set.seed(0)
x <- sweep(matrix(rnorm(n*3), n, 3) %*% diag(1:3),
           MARGIN = 2, STATS = mu, FUN = "+")
mvn(modelName = "XXI", x)
mvn(modelName = "Diagonal", x)

Sigma <- matrix(c(9,-4,1,-4,9,4,1,4,9), 3, 3)
set.seed(0)
x <- sweep(matrix(rnorm(n*3), n, 3) %*% chol(Sigma),
           MARGIN = 2, STATS = mu, FUN = "+")
mvn(modelName = "XXX", x)
mvn(modelName = "Ellipsoidal", x)

mvnX Multivariate Normal Fit

Description

Computes the mean, covariance, and loglikelihood from fitting a single MVN or Gaussian.

mvnX(data)
mvnXII(data)
mvnXXI(data)
mvnXXX(data)

Arguments

Details

mvnXII computes the best fitting Gaussian with the covariance restricted to be a multiple of the identity. mvnXXI computes the best fitting Gaussian with the covariance restricted to be diagonal. mvnXXX computes the best fitting Gaussian with ellipsoidal (unrestricted) covariance.

A list including the parameters of the Gaussian model best fitting the data, and the corresponding loglikelihood for the data under the model.
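The three covariance restrictions have closed-form fits, which the base-R sketch below illustrates. fitGaussian is a hypothetical helper, not a package function; it assumes maximum likelihood estimates (divisor n rather than n - 1), and the list it returns is not claimed to match the layout of the package's return value.

```r
# Hypothetical helper illustrating the three covariance restrictions;
# it assumes maximum likelihood estimates (divisor n, not n - 1).
fitGaussian <- function(data, type = c("XII", "XXI", "XXX")) {
  type <- match.arg(type)
  data <- as.matrix(data)
  n <- nrow(data)
  d <- ncol(data)
  mu <- colMeans(data)
  centred <- sweep(data, 2, mu)
  S <- crossprod(centred) / n                 # unrestricted ML covariance
  Sigma <- switch(type,
    XII = diag(mean(diag(S)), d),             # multiple of the identity
    XXI = diag(diag(S), d),                   # diagonal
    XXX = S)                                  # ellipsoidal (unrestricted)
  # Gaussian loglikelihood of the data at the fitted parameters.
  maha <- mahalanobis(data, mu, Sigma)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  loglik <- -0.5 * (n * d * log(2 * pi) + n * logdet + sum(maha))
  list(mu = mu, Sigma = Sigma, loglik = loglik)
}

set.seed(0)
x <- matrix(rnorm(300), 100, 3) %*% diag(1:3)
ll <- sapply(c("XII", "XXI", "XXX"), function(m) fitGaussian(x, m)$loglik)
ll
```

Because the three models are nested, the loglikelihood cannot decrease when moving from the spherical to the diagonal to the unrestricted fit.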

References

See Also

mvn, mstepE


Examples

n <- 1000

set.seed(0)
x <- rnorm(n, mean = -1, sd = 2)
mvnX(x)

mu <- c(-1, 0, 1)

set.seed(0)
x <- sweep(matrix(rnorm(n*3), n, 3) %*% (2*diag(3)),
           MARGIN = 2, STATS = mu, FUN = "+")
mvnXII(x)

set.seed(0)
x <- sweep(matrix(rnorm(n*3), n, 3) %*% diag(1:3),
           MARGIN = 2, STATS = mu, FUN = "+")
mvnXXI(x)

Sigma <- matrix(c(9,-4,1,-4,9,4,1,4,9), 3, 3)
set.seed(0)
x <- sweep(matrix(rnorm(n*3), n, 3) %*% chol(Sigma),
           MARGIN = 2, STATS = mu, FUN = "+")
mvnXXX(x)

partconv Convert partitioning into numerical vector.

Description

partconv converts a partitioning into a numerical vector. The second argument is used to force consecutive numbers (default) or not.

partconv(x, consec=TRUE)

Arguments

x Partitioning. May be numerical or not.

consec Logical flag, whether or not to use consecutive class numbers.

Vector of class numbers.

Examples

data(iris)
partconv(iris[,5])

cl <- sample(1:10, 25, replace=TRUE)
partconv(cl, consec=FALSE)
partconv(cl, consec=TRUE)


partuniq Classifies Data According to Unique Observations

Description

Gives a one-to-one mapping from unique observations to rows of a data matrix.

partuniq(x)

Arguments

x Matrix of observations.

A vector of length nrow(x) with integer entries. An observation k is assigned an integer i whenever observation i is the first row of x that is identical to observation k (note that i <= k).
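The mapping can be reproduced in base R by matching each row against the first identical row. The sketch below is illustrative only: firstIdenticalRow is a hypothetical name, and it compares rows through their character representation, so values that differ only beyond the printed precision would be treated as identical.

```r
# Hypothetical base-R equivalent of the mapping described above.
firstIdenticalRow <- function(x) {
  x <- as.matrix(x)
  # Collapse each row to one string key, then find its first occurrence.
  key <- apply(x, 1, paste, collapse = "\r")
  match(key, key)
}

x <- rbind(c(1, 2), c(3, 4), c(1, 2), c(3, 4), c(5, 6))
firstIdenticalRow(x)   # 1 2 1 2 5
```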

Examples

data(iris)
partuniq(as.matrix(iris[,1:4]))

plot.Mclust Plot Model-Based Clustering Results

Description

Plot model-based clustering results: BIC, classification, uncertainty and (for one- and two-dimensional data) density.

plot.Mclust(x, data, dimens = c(1, 2), scale = FALSE, ...)

Arguments

x Output from Mclust.

data The data used to produce x.

dimens An integer vector of length two specifying the dimensions for coordinate projections if the data is more than two-dimensional. The default is c(1,2) (the first two dimensions).

... Further arguments to the lower level plotting functions.


Plots selected via a menu including the following options: BIC values used for choosing the number of clusters. For data in more than two dimensions, a pairs plot showing the classification, and coordinate projections of the data showing location of the mixture components, classification, and/or uncertainty. For one- and two-dimensional data, plots showing location of the mixture components, classification, uncertainty, and/or density.

References

See Also

Mclust

Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisMclust <- Mclust(irisMatrix)

## Not run: plot(irisMclust,irisMatrix)

plot.mclustDA Plotting method for MclustDA discriminant analysis.

Description

Plots training and test data, known training data classification, mclustDA test data classification, and/or training errors.

plot.mclustDA(x, trainingData, labels, testData, dimens=c(1,2),
              scale = FALSE, identify=FALSE, ...)

Arguments

x The object produced by applying mclustDA with trainingData and classification labels to testData.

trainingData The numeric vector, matrix, or data frame of training observations used to obtain x.

labels The numeric or character vector assigning a class label to each training observation.


testData A numeric vector, matrix, or data frame of test observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.

dimens An integer vector of length two specifying the dimensions for coordinate projections if the data is more than two-dimensional. The default is c(1,2) (the first two dimensions).

identify A logical variable indicating whether or not to print a title identifying the plot. Default: identify=FALSE.

... Further arguments to the lower level plotting functions.

Plots selected via a menu including the following options: training and test data, known training data classification, mclustDA test data classification, training errors.

References

See Also

mclustDA

Examples

set.seed(0)
n <- 100 ## create artificial data
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))
## Not run:
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

odd <- seq(from = 1, to = 2*n, by = 2)
even <- odd + 1
testMclustDA <- mclustDA(trainingData = x[odd, ], labels = xclass[odd],
                         testData = x[even,])

clEven <- testMclustDA$testClassification ## classification of the test set
compareClass(clEven,xclass[even])

## Not run:
plot(testMclustDA, trainingData = x[odd, ], labels = xclass[odd],
     testData = x[even,])
## End(Not run)


randProj Random projections for data in more than two dimensions modelled by an MVN mixture.

Description

Plots random projections given data in more than two dimensions and parameters of an MVN mixture model for the data.

randProj(data, seeds = 0, ..., type = c("classification", "uncertainty", "errors"),
         ask = TRUE, quantiles = c(0.75,0.95), symbols, scale = FALSE,
         identify = FALSE, CEX = 1, PCH = ".", xlim, ylim)

Arguments

seeds A vector of integers between 0 and 1000, specifying seeds for the random projections. The default value is the single seed 0.

mu A matrix whose columns are the means of each group.

sigma A three dimensional array in which sigma[,,k] gives the covariance for the kth group.

decomp A list with scale, shape and orientation components giving an alternative form for the covariance structure of the mixture model.


symbols Either an integer or character vector assigning a plotting symbol to each unique class in classification. Elements in symbols correspond to classes in classification in order of appearance in classification (the order used by the S-PLUS function unique). Default: If G is the number of groups in the classification, the first G symbols in .Mclust$symbols, otherwise if G is less than 27 then the first G capital letters in the Roman alphabet.

Random projections of the data, possibly showing location of the mixture components, classification, uncertainty, and classification errors.
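How randProj draws its projections is not described on this page. As a rough illustration only, one common way to obtain a random 2-D projection is to orthonormalize a random Gaussian matrix and multiply the data by it. The sketch below uses base R and the built-in iris data; the variable names are illustrative and the construction is an assumption, not the package's implementation.

```r
# Sketch (assumption): random 2-D orthonormal projection of 4-D data, base R only.
set.seed(0)
X <- as.matrix(iris[, 1:4])                   # 150 x 4 data matrix

# Orthonormalize a random 4 x 2 Gaussian matrix with a QR decomposition.
Q <- qr.Q(qr(matrix(rnorm(4 * 2), 4, 2)))     # columns are orthonormal

proj <- X %*% Q                               # 150 x 2 projected coordinates

## Not run:
plot(proj, col = as.integer(iris[, 5]), xlab = "", ylab = "")
## End(Not run)
```

Changing the value passed to set.seed gives a different projection, which mirrors the role of the seeds argument.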

References

See Also

coordProj, spinProj, mclust2Dplot, mclustOptions, do.call

Examples

par(pty = "s", mfrow = c(2,3))
randProj(irisMatrix, seeds = 0:5, truth=irisClass,
         mu = msEst$mu, sigma = msEst$sigma, z = msEst$z)
do.call("randProj", c(list(data = irisMatrix, seeds = 0:5, truth=irisClass),
                      msEst))


sigma2decomp Convert mixture component covariances to decomposition form.

Description

Converts a set of covariance matrices from representation as a 3-D array to a parameterization by eigenvalue decomposition.

sigma2decomp(sigma, G, tol, ...)

Arguments

sigma Either a 3-D array whose [,,k]th component is the covariance matrix for the kth component in an MVN mixture model, or a single covariance matrix in the case that all components have the same covariance.

G The number of components in the mixture. When sigma is a 3-D array, the number of components can be inferred from its dimensions.

tol Tolerance for determining whether or not the covariances have equal volume, shape, and/or orientation. The default is the square root of the relative machine precision, sqrt(.Machine$double.eps), which is about 1.e-8.

The covariance matrices for the mixture components in decomposition form, including the following components:

scale Either a G-vector giving the scale of the covariance (the dth root of its determinant) for each component in the mixture model, or a single numeric value if the scale is the same for each component.

shape Either a G by d matrix in which the kth column is the shape of the covariance matrix (normalized to have determinant 1) for the kth component, or a d-vector giving a common shape for all components.

orientation Either a d by d by G array whose [,,k]th entry is the orthonormal matrix of eigenvectors of the covariance matrix of the kth component, or a d by d orthonormal matrix if the mixture components have a common orientation. The orientation component of decomp can be omitted in spherical and diagonal models, for which the principal components are parallel to the coordinate axes so that the orientation matrix is the identity.
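The scale, shape and orientation components described above can be illustrated for a single covariance matrix with base R's eigen. This is a minimal sketch under the definitions given on this page (scale is the dth root of the determinant, shape is normalized to have product 1, orientation holds the eigenvectors). The helper decompose_cov is hypothetical and is not part of mclust.

```r
# Hypothetical helper (not part of mclust): decompose one covariance matrix
# into scale, shape and orientation via its eigendecomposition.
decompose_cov <- function(Sigma) {
  d <- nrow(Sigma)
  e <- eigen(Sigma, symmetric = TRUE)
  scale <- prod(e$values)^(1 / d)    # dth root of the determinant
  shape <- e$values / scale          # eigenvalues rescaled so their product is 1
  list(scale = scale, shape = shape, orientation = e$vectors)
}

Sigma <- matrix(c(4, 1, 1, 3), 2, 2)
dec <- decompose_cov(Sigma)

# Rebuild the covariance: scale * O %*% diag(shape) %*% t(O)
rebuilt <- dec$scale * dec$orientation %*% diag(dec$shape) %*% t(dec$orientation)
```

Rebuilding the matrix from the three components is a quick check that no information was lost in the decomposition.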


References

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation, and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

decomp2sigma

Examples

meEst <- meEEE(irisMatrix, unmap(irisClass))
names(meEst)
meEst$sigma

sigma2decomp(meEst$sigma)
## Not run:
do.call("sigma2decomp", meEst) ## alternative call
## End(Not run)

sim Simulate from Parameterized MVN Mixture Models

Description

Simulate data from parameterized MVN mixture models.

sim(modelName, mu, ..., seed = 0)

Arguments

modelName A character string indicating the model:

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

decomp for the diagonal models ("EEI", "VEI", "EVI", "VVI") and some ellipsoidal models ("EEV", "VEV"). This is a list described in cdens.

pro Component mixing proportions. If missing, equal proportions are assumed.

n An integer specifying the number of data points to be simulated.

seed An integer between 0 and 1000, inclusive, for specifying a seed for random class assignment. The default value is 0.

Details

This function can be used with an indirect or list call using do.call, allowing the output of e.g. mstep, em, me, or EMclust to be passed directly without the need to specify individual parameters as arguments.

A data set consisting of n points simulated from the specified MVN mixture model.
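To make the idea of simulating from an MVN mixture concrete, the following base R sketch draws a class label for each point according to the mixing proportions and then draws the point from that component's Gaussian. The function sim_mixture and its arguments are hypothetical; it takes the covariances as a d by d by G array and is not the package's sim, whose internals are not shown on this page.

```r
# Hypothetical sketch (not mclust's sim): simulate n points from a Gaussian mixture.
# mu:    d x G matrix whose columns are the component means
# sigma: d x d x G array of component covariance matrices
# pro:   mixing proportions
sim_mixture <- function(n, mu, sigma, pro) {
  d <- nrow(mu)
  G <- ncol(mu)
  cl <- sample(G, n, replace = TRUE, prob = pro)   # random class assignment
  x <- matrix(0, n, d)
  for (k in seq_len(G)) {
    idx <- which(cl == k)
    if (length(idx) == 0) next
    R <- chol(sigma[, , k])                        # t(R) %*% R equals sigma[,,k]
    z <- matrix(rnorm(length(idx) * d), length(idx), d)
    x[idx, ] <- sweep(z %*% R, 2, mu[, k], "+")    # rows are N(mu_k, sigma_k)
  }
  structure(x, classification = cl)
}

set.seed(1)
mu <- cbind(c(0, 0), c(10, 10))
sigma <- array(diag(2), c(2, 2, 2))                # identity covariance for both components
simdat <- sim_mixture(200, mu, sigma, pro = c(0.5, 0.5))
```

The class labels are attached as a "classification" attribute, matching the way the examples on this page read them back with attr.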

References

See Also

simE, ..., simVVV, EMclust, mstep, do.call

Examples

irisBic <- EMclust(irisMatrix)
irisSumry <- summary(irisBic,irisMatrix)
names(irisSumry)
irisSim <- sim(modelName = irisSumry$modelName, n = dim(irisMatrix)[1],
               mu = irisSumry$mu, decomp = irisSumry$decomp, pro = irisSumry$pro)
## Not run:
irisSim <- do.call("sim", irisSumry) ## alternative call
## End(Not run)

par(pty = "s", mfrow = c(1,2))
dimens <- c(1,2)
xlim <- range(rbind(irisMatrix,irisSim)[,dimens][,1])
ylim <- range(rbind(irisMatrix,irisSim)[,dimens][,2])

cl <- irisSumry$classification
coordProj(irisMatrix, par=irisSumry, classification=cl, dimens=dimens,
          xlim=xlim, ylim=ylim)
cl <- attr(irisSim,"classification")
coordProj(irisSim, par=irisSumry, classification=cl, dimens=dimens,
          xlim=xlim, ylim=ylim)

irisSumry3 <- summary(irisBic,irisMatrix, G=3)
irisSim3 <- do.call("sim", c(list(n = 500, seed = 1), irisSumry3))
clPairs(irisSim3, cl = attr(irisSim3,"classification"))

simE Simulate from a Parameterized MVN Mixture Model

Description

Simulate data from a parameterized MVN mixture model.

simE(mu, sigmasq, pro, ..., seed = 0)
simV(mu, sigmasq, pro, ..., seed = 0)
simEII(mu, sigmasq, pro, ..., seed = 0)
simVII(mu, sigmasq, pro, ..., seed = 0)
simEEI(mu, decomp, pro, ..., seed = 0)
simVEI(mu, decomp, pro, ..., seed = 0)
simEVI(mu, decomp, pro, ..., seed = 0)
simVVI(mu, decomp, pro, ..., seed = 0)
simEEE(mu, pro, ..., seed = 0)
simEEV(mu, decomp, pro, ..., seed = 0)
simVEV(mu, decomp, pro, ..., seed = 0)
simVVV(mu, pro, ..., seed = 0)

Arguments

decomp for the diagonal models ("EEI", "VEI", "EVI", "VVI") and some ellipsoidal models ("EEV", "VEV"). This is a list described in cdens.

pro Component mixing proportions. If missing, equal proportions are assumed.

Other terms describing variance:

Sigma for the equal variance model "EEE". A d by d matrix giving the common covariance for all components of the mixture model.

sigma for the unconstrained variance model "VVV". A d by d by G array whose [,,k]th entry is the covariance matrix for the kth component of the mixture model.

The form of the variance specification is the same as for the output for the em, me, or mstep methods for the specified mixture model.

n An integer specifying the number of data points to be simulated.

seed An integer between 0 and 1000, inclusive, for specifying a seed for random class assignment. The default value is 0.

Details

This function can be used with an indirect or list call using do.call, allowing the output of e.g. mstep, em, me, or EMclust to be passed directly without the need to specify individual parameters as arguments.

A data set consisting of n points simulated from the specified MVN mixture model.

References

See Also

sim, EMclust, mstepE, do.call

Examples

d <- 2
G <- 2
scale <- 1
shape <- c(1, 9)

O1 <- diag(2)
O2 <- diag(2)[,c(2,1)]
O <- array(cbind(O1,O2), c(2, 2, 2))
O

decomp <- list(d= d, G = G, scale = scale, shape = shape, orientation = O)
mu <- matrix(0, d, G) ## center at the origin
simdat <- simEEV(n=200, mu=mu, decomp=decomp, pro = c(1,1))

cl <- attr(simdat, "classification")
sigma <- array(apply(O, 3, function(x,y) crossprod(x*y),
                     y = sqrt(scale*shape)), c(2,2,2))
paramList <- list(mu = mu, sigma = sigma)
coordProj( simdat, paramList = paramList, classification = cl)

spinProj Planar spin for random projections of data in more than two dimensions modelled by an MVN mixture.

Description

Plots random 2-D projections with successive rotations through specified angles, given data in more than two dimensions and parameters of an MVN mixture model.

spinProj(data, ..., angles, seed = 0, reflection = FALSE,
         type = c("classification", "uncertainty", "errors"),
         ask = TRUE, quantiles = c(0.75,0.95), symbols, scale = FALSE,
         identify = FALSE, CEX = 1, PCH = ".", xlim, ylim)

Arguments


angles The angles (in radians) through which successive projections should be rotated or reflected.

seed An integer between 0 and 1000, inclusive, for specifying a seed for generating the initial random projection. The default value is 0. The seed/projection correspondence is the same as in randProj.

reflection A logical variable telling whether the data should be reflected or rotated through the given angles. The default is rotation.

symbols Either an integer or character vector assigning a plotting symbol to each unique class in classification. Elements in symbols correspond to classes in classification in order of appearance in classification (the order used by the S-PLUS function unique). Default: If G is the number of groups in the classification, the first G symbols in .Mclust$symbols, otherwise if G is less than 27 then the first G capital letters in the Roman alphabet.

Rotations or reflections of a random projection of the data, possibly showing location of the mixture components, classification, uncertainty and/or classification errors.

References

See Also

coordProj, randProj, mclust2Dplot, mclustOptions, do.call

Examples

par(pty = "s", mfrow = c(2,2))
spinProj(irisMatrix, seed = 1, truth=irisClass,
         mu = msEst$mu, sigma = msEst$sigma, z = msEst$z)
do.call("spinProj", c(list(data = irisMatrix, seed = 2, truth=irisClass),
                      msEst))

summary.EMclust Summary function for EMclust

Description

Optimal model characteristics and classification for EMclust results.

summary.EMclust(object, data, G, modelNames, ...)

Arguments

object An "EMclust" object, which is the result of applying EMclust to data.

data The matrix or vector of observations used to generate ‘object’.

G A vector of integers giving the numbers of mixture components (clusters) over which the summary is to take place (as.character(G) must be a subset of the column names of ‘object’). The default is to summarize over all of the numbers of mixture components used in the original analysis.

modelNames A vector of character strings denoting the models over which the summary is to take place (must be a subset of the row names of ‘object’). The default is to summarize over all models used in the original analysis.

... Not used. For generic/method consistency.


A list giving the optimal (according to BIC) parameters, conditional probabilities z, and loglikelihood, together with the associated classification and its uncertainty.
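The selection step can be pictured as a lookup in a table of BIC values restricted to the requested G and modelNames. The sketch below uses a small made-up matrix laid out as the argument descriptions above state (model names as row names, numbers of components as column names) and assumes the mclust convention that a larger BIC is better. The numbers are illustrative only and do not come from any real fit.

```r
# Illustrative BIC table (made-up values): rows are models, columns are G.
bic <- matrix(c(-830, -790, -800,
                -825, -760, -775,
                -820, -770, -765),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("VII", "VVI", "VVV"), c("1", "2", "3")))

# Restrict the summary to a subset of G and of model names.
G <- 2:3
modelNames <- c("VII", "VVV")
sub <- bic[modelNames, as.character(G), drop = FALSE]

# Assumption: the best model is the one with the largest BIC.
best <- which(sub == max(sub, na.rm = TRUE), arr.ind = TRUE)[1, ]
bestModel <- rownames(sub)[best["row"]]
bestG <- as.integer(colnames(sub)[best["col"]])
```

Here the restriction excludes the overall maximum (model "VVI" with G = 2), so the choice falls on the best entry among the remaining candidates.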

References

See Also

EMclust

Examples

irisBic <- EMclust(irisMatrix)
summary(irisBic, irisMatrix)
summary(irisBic, irisMatrix, G = 1:6, modelName = c("VII", "VVI", "VVV"))

summary.EMclustN summary function for EMclustN

Description

Optimal model characteristics and classification for EMclustN results.

summary.EMclustN(object, data, G, modelNames, ...)

Arguments

object An "EMclustN" object, which is the result of applying EMclustN to data with an initial noise estimate.

data The matrix or vector of observations used to generate ‘object’.

G A vector of integers giving the numbers of mixture components (clusters) over which the summary is to take place (as.character(G) must be a subset of the column names of ‘object’). The default is to summarize over all of the numbers of mixture components used in the original analysis.

modelNames A vector of character strings denoting the models over which the summary is to take place (must be a subset of the row names of ‘object’). The default is to summarize over all models used in the original analysis.

A list giving the optimal (according to BIC) parameters, conditional probabilities z, and loglikelihood, together with the associated classification and its uncertainty.

References

See Also

EMclustN

Examples

b <- apply( irisMatrix, 2, range)
n <- 450
set.seed(0)
poissonNoise <- apply(b, 2, function(x, n=n)
                      runif(n, min = x[1]-0.1, max = x[2]+.1), n = n)
set.seed(0)
noiseInit <- sample(c(TRUE,FALSE),size=150+450,replace=TRUE,prob=c(3,1))
irisNoise <- rbind(irisMatrix, poissonNoise)

Bic <- EMclustN(data=irisNoise, noise = noiseInit)
summary(Bic, irisNoise)
summary(Bic, irisNoise, G = 0:6, modelName = c("VII", "VVI", "VVV"))

summary.Mclust Very brief summary of an Mclust object.

Description

Function gives a brief summary of an Mclust object: the type of model that is picked and the number of clusters.

summary.Mclust(object, ...)

Arguments

object The result of a call to function Mclust.

... Not used.

summary.mclustDAtest Classification and posterior probability from mclustDAtest.

Description

Classifications from mclustDAtest and the corresponding posterior probabilities.

summary.mclustDAtest(object, pro, ...)

Arguments

object The output of mclustDAtest.

pro Prior probabilities for each class in the training data.

A list with the following two components:

classification The classification from mclustDAtest.

z Matrix of posterior probabilities in which the [i,j]th entry is the probability of observation i belonging to class j.
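The relation between prior probabilities, per-class densities and the posterior matrix z is Bayes' rule. The base R sketch below assumes a matrix dens whose [i,j]th entry is the density of observation i under the model for class j; that layout, the helper name posterior_from_dens, and the toy numbers are assumptions for illustration, not the structure of the mclustDAtest output.

```r
# Hypothetical helper: posterior class probabilities from class densities and priors.
# dens: n x m matrix, dens[i, j] = density of observation i under class j (assumed layout)
# pro:  prior probability of each of the m classes
posterior_from_dens <- function(dens, pro) {
  w <- sweep(dens, 2, pro, "*")          # prior times density
  z <- w / rowSums(w)                    # normalize each row to sum to 1
  list(classification = max.col(z, ties.method = "first"), z = z)
}

# Toy densities for two observations and two classes (illustrative values).
dens <- rbind(c(0.20, 0.10),
              c(0.05, 0.40))
res <- posterior_from_dens(dens, pro = c(0.5, 0.5))
```

Each observation is assigned to the class with the largest posterior probability, so the first toy observation goes to class 1 and the second to class 2.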

References

See Also

mclustDAtest

Examples

set.seed(0)
n <- 100 ## create artificial data

x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])

xclass <- c(rep(1,n),rep(2,n))
## Not run:
par(pty = "s")
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

even <- seq(1, 2*n, 2)
test <- mclustDAtest(x[even, ], train) ## compute model densities
testSummary <- summary(test) ## classify training set

names(testSummary)
testSummary$class
testSummary$z

summary.mclustDAtrain Models and classifications from mclustDAtrain

Description

The models selected in mclustDAtrain and the corresponding classifications.

summary.mclustDAtrain(object, ...)

Arguments

object The output of mclustDAtrain.

A list identifying the model selected by mclustDAtrain for each class of training data and the corresponding classification.

References

See Also

mclustDAtrain


Examples

set.seed(0)
n <- 100 ## create artificial data

x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])

xclass <- c(rep(1,n),rep(2,n))
## Not run:
par(pty = "s")
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

surfacePlot Density or uncertainty surface for two dimensional mixtures.

Description

Plots a density or uncertainty surface given two-dimensional data and parameters of an MVN mixture model for the data.

surfacePlot(data, mu, pro, ..., type = c("contour", "image", "persp"),
            what = c("density", "uncertainty", "skip"),
            transformation = c("none", "log", "sqrt"),
            grid = 50, nlevels = 20, scale = FALSE, identify = FALSE,
            verbose = FALSE, xlim, ylim, swapAxes = FALSE)

Arguments

pro Component mixing proportions.

... An argument specifying the covariance structure of the model. If used in an indirect function call via do.call (see example below), it is usually not necessary to know the precise form for this argument. This argument usually takes one of the following forms:

decomp A list with scale, shape and orientation components giving an alternative form for the covariance structure of the mixture model.

type Any subset of c("contour","image","persp") indicating the plot type. For more than one selection, users will be asked to choose from a menu.


what Any subset of c("density","uncertainty","skip") indicating what to plot. For more than one selection, users will be asked to choose from a menu. The "skip" option produces an empty plot, which may be useful if multiple plots are displayed simultaneously.

transformation Any subset of c("none","log","sqrt") indicating a transformation to be applied to the surface values before plotting. For more than one selection, users will be asked to choose from a menu.

grid The number of grid points (evenly spaced on each axis). The mixture density and uncertainty are computed at grid x grid points to produce the surface plot. Default: 50.

nlevels The number of levels to use for a contour plot. Default: 20.

scale A logical variable indicating whether or not the two chosen dimensions should be plotted on the same scale, and thus preserve the shape of the distribution. Default: scale=FALSE

verbose A logical variable telling whether or not to print an indication that the function is in the process of computing values at the grid points, which typically takes some time to complete.

xlim, ylim Arguments specifying bounds for the abscissa and ordinate of the plot, respectively. This may be useful when comparing plots.

swapAxes A logical variable indicating whether or not the axes should be swapped for the plot.

An invisible list with components x, y, and z in which x and y are the values used to define the grid and z is the transformed density or uncertainty at the grid points.

Side Effects

One or more plots showing location of the mixture components, classification, uncertainty, and/or classification errors.

Details

For an image plot, a color scheme may need to be selected on the display device in order to view the plot.

References

See Also

mclust2Dplot, do.call

Examples

xEMclust <- summary(EMclust(x),x)
surfacePlot(x, mu = xEMclust$mu, sigma = xEMclust$sigma, pro=xEMclust$pro,
            type = "contour", what = "density", transformation = "none")

## Not run: do.call("surfacePlot", c(list(data = x), xEMclust))

uncerPlot Uncertainty Plot for Model-Based Clustering

Description

Plots the uncertainty in converting a conditional probability from EM to a classification in model-based clustering.

uncerPlot(z, truth, ...)

Arguments

z A matrix whose [i,k]th entry is the conditional probability of the ith observation belonging to the kth component of the mixture.

truth A numeric or character vector giving the true classification of the data.

Details

When truth is provided and the number of classes is compatible with z, the function compareClass is used to find the best correspondence between classes in truth and z.

A plot of the uncertainty profile of the data, with uncertainties in increasing order of magnitude. If truth is supplied and the number of classes is the same as the number of columns of z, the uncertainty of the misclassified data is marked by vertical lines on the plot.
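The quantity being plotted can be computed directly from z. The sketch below assumes the usual definition of classification uncertainty in model-based clustering, one minus the largest conditional probability in each row, and sorts it to obtain the profile. The small z matrix is made up for illustration.

```r
# Toy conditional probabilities for three observations and three components
# (illustrative values; each row sums to 1).
z <- rbind(c(0.90, 0.10, 0.00),
           c(0.40, 0.35, 0.25),
           c(0.05, 0.05, 0.90))

# Assumed definition: uncertainty is 1 minus the largest conditional probability.
uncertainty <- 1 - apply(z, 1, max)

# Uncertainty profile: the same values in increasing order.
profile <- sort(uncertainty)

## Not run:
plot(profile, type = "l", xlab = "observations in order of increasing uncertainty",
     ylab = "uncertainty")
## End(Not run)
```

An observation whose probabilities are spread across components, like the second row, ends up at the high end of the profile.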

References

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

EMclust, em, me, mapClass

Examples

irisBic <- EMclust(irisMatrix)
irisSumry3 <- summary(irisBic, irisMatrix, G = 3)

uncerPlot(z = irisSumry3$z)

uncerPlot(z = irisSumry3$z, truth = rep(1:3, rep(50,3)))

do.call("uncerPlot", c(irisSumry3, list(truth = rep(1:3, rep(50,3)))))

unmap Indicator Variables given Classification

Description

Converts a classification into a matrix of indicator variables.

unmap(classification, noise, ...)

Arguments

classification A numeric or character vector. Typically the distinct entries of this vector would represent a classification of observations in a data set.

noise A single numeric or character value used to indicate observations corresponding to noise.

An n by m matrix of (0,1) indicator variables, where n is the length of classification and m is the number of unique values or symbols in classification. Columns are labeled by the unique values in classification, and the [i,j]th entry is 1 if classification[i] is the jth unique value or symbol in order of appearance in the classification. If a noise value or symbol is designated, the corresponding indicator variables are located in the last column of the matrix.
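The conversion described above can be reproduced in a few lines of base R, which may help when checking what unmap returns. The helper indicator_matrix below is a hypothetical stand-in, not the package function, and it does not handle the noise argument.

```r
# Hypothetical stand-in for unmap (no noise handling): classification to indicator matrix.
indicator_matrix <- function(classification) {
  u <- unique(classification)                  # unique values in order of appearance
  z <- outer(classification, u, "==") * 1      # n x m matrix of 0/1 indicators
  dimnames(z) <- list(NULL, as.character(u))   # columns labeled by the unique values
  z
}

cl <- c("b", "a", "b", "c")
z <- indicator_matrix(cl)

# Going back: the column holding the 1 in each row recovers the classification.
recovered <- colnames(z)[apply(z, 1, which.max)]
```

Because each row contains exactly one 1, taking the position of the row maximum inverts the conversion, which is the role map plays in the examples on this page.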


References

See Also

map, estep, me

Examples

z <- unmap(irisClass)
z

emEst <- me(modelName = "VVV", data = irisMatrix, z = z)
emEst$z

map(emEst$z)

