%----------------------------------------------------------
%
% Nusrat Jahan
%
% Name of this file is ece2.tex
%----------------------------------------------------------
\font\bfone=cmbx10 scaled\magstep1
\font\bftwo=cmbx10 scaled\magstep2
\font\smc=cmcs11
\centerline{\bftwo REVIEW OF ICL CRITERION FOR CLUSTERING}
\centerline{\bftwo IN MIXTURE MODELS}
\vskip 4mm
\centerline{\bfone{\it Nusrat Jahan}}
\vskip 4mm
\centerline{\bfone Department of Mathematics and Statistics}
\centerline{\bfone Mississippi State University}
\centerline{\bfone Mississippi State, MS 39762 USA}
\centerline{\bfone email:njahan@ra.msstate.edu}
\vskip 10mm
\centerline{\bfone ABSTRACT}
\vskip 4mm
\noindent
Statistical analysis of mixture models is of interest because these
models offer an alternative to nonparametric density estimation and a
powerful way of modelling in cluster analysis. In density estimation,
optimizing the Bayesian information criterion (BIC) generally results
in a good approximation of the density to be estimated. In cluster
analysis, however, BIC tends to overestimate the number of clusters
when the data fit the mixture model poorly. In this context, a
modification of BIC, the integrated completed likelihood (ICL)
criterion, has been investigated. In the ICL approach, the integrated
completed likelihood is maximized to select both a relevant form of
model and a relevant number of clusters. In the BIC approach, only the
observed likelihood is maximized, whereas the integrated completed
likelihood also includes the missing data, estimated by the maximum
a posteriori (MAP) rule. The ICL criterion penalizes for the complexity
of the mixture model, thus ensuring the partitioning of data with the
greatest evidence. This paper focuses on the computation of ICL and on
its effectiveness and drawbacks in the context of cluster analysis.
The differences between ICL and BIC are also investigated.
\bye