Finite Mixture Models

Overview

Finite mixture models have been used for more than 100 years but have seen a real boost in popularity over the last decade due to the tremendous increase in available computing power. Applications in disjoint scientific communities have led to the development of many variants and extensions for special cases, without a proper analysis of many structural and statistical properties of the general model class.

The EM algorithm provides a unifying framework for maximum likelihood estimation of parameters. However, the identification of these models has only been considered for special cases, and a thorough investigation of recent extensions and variants, such as mixtures of generalized linear models, is still missing. One major goal of this project is to develop a general theory for the identification of mixture models in a top-down approach.
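For orientation, the model class and the EM iteration referred to above can be written in a common textbook form. The notation (K components, weights \pi_k, component densities f with parameters \theta_k, observations (x_n, y_n) for n = 1, ..., N) is an assumption made for this sketch and is not taken from the project text.

```latex
% Finite mixture with K components (regression setting)
h(y \mid x, \psi) = \sum_{k=1}^{K} \pi_k \, f(y \mid x, \theta_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

% E-step: posterior probability that observation n belongs to component k
\hat{p}_{nk} = \frac{\pi_k \, f(y_n \mid x_n, \theta_k)}
                    {\sum_{j=1}^{K} \pi_j \, f(y_n \mid x_n, \theta_j)}

% M-step: update the weights and maximize the weighted log-likelihood per component
\hat{\pi}_k = \frac{1}{N} \sum_{n=1}^{N} \hat{p}_{nk},
\qquad
\hat{\theta}_k = \arg\max_{\theta_k} \sum_{n=1}^{N} \hat{p}_{nk} \, \log f(y_n \mid x_n, \theta_k)
```

Identification concerns whether different parameter vectors \psi can produce the same density h. Relabeling the components always does, so identifiability is usually understood up to such permutations.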

In addition to the theoretical investigations, we develop an open-source reference implementation within R, an environment for statistical computing and graphics. State-of-the-art estimation techniques will be made available through a uniform and convenient user interface. Automatic model selection, diagnostic tools, and checking of identifiability constraints for a specified model class and a given data set will be implemented, all of which are almost completely missing in existing software packages. The ultimate goal is a comprehensive methodological and computational toolbox for identification and estimation of finite mixture models.

AASC Project Members

R Packages

  • FlexMix: Flexible Mixture Modeling
    A general framework for finite mixtures of regression models using the EM algorithm. FlexMix provides the E-step and all data handling, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models, and model-based clustering.
  • BayesMix: Bayesian Mixture Models with JAGS
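
The division of labour described for FlexMix (a generic E-step, with the M-step supplied per model) can be illustrated with a small sketch. It is written in Python rather than R and does not reproduce FlexMix's actual interface: the names `gaussian_driver`, `e_step`, and `fit_mixture`, the toy data, and the hard initial assignment are all hypothetical choices made for this illustration.

```python
import math


def gaussian_driver(xs, weights):
    """Hypothetical M-step driver: weighted ML fit of one 1-D Gaussian."""
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / total
    var = max(var, 1e-6)  # crude guard against a degenerate component

    def logdens(x):
        return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

    return {"params": {"mean": mean, "var": var}, "logdens": logdens}


def e_step(xs, priors, components):
    """Generic E-step: posterior component probabilities and log-likelihood."""
    posteriors, loglik = [], 0.0
    for x in xs:
        logs = [math.log(p) + c["logdens"](x) for p, c in zip(priors, components)]
        top = max(logs)
        norm = top + math.log(sum(math.exp(v - top) for v in logs))  # log-sum-exp
        posteriors.append([math.exp(v - norm) for v in logs])
        loglik += norm
    return posteriors, loglik


def fit_mixture(xs, posteriors, driver, max_iter=200, tol=1e-8):
    """EM loop; only `driver` knows anything about the component model."""
    k = len(posteriors[0])
    history = []
    for _ in range(max_iter):
        # M-step: update the weights, then refit each component via the driver
        priors = [sum(row[j] for row in posteriors) / len(xs) for j in range(k)]
        components = [driver(xs, [row[j] for row in posteriors]) for j in range(k)]
        # E-step: recompute posteriors under the new parameters
        posteriors, loglik = e_step(xs, priors, components)
        history.append(loglik)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
    return priors, components, posteriors, history


# Toy data (assumed for illustration): two well-separated groups
xs = [-1.0, -0.5, 0.0, 0.5, 1.0, 9.0, 9.5, 10.0, 10.5, 11.0]
# Hard initial assignment; a real fit would typically start from random assignments
init = [[1.0, 0.0] if x < 5 else [0.0, 1.0] for x in xs]
priors, components, posteriors, history = fit_mixture(xs, init, gaussian_driver)
for prior, comp in zip(priors, components):
    print(round(prior, 3), comp["params"])
```

Replacing `gaussian_driver` with a function that performs, say, a weighted regression fit would change the model class without touching `e_step` or `fit_mixture`. The sketch does not handle components whose weight shrinks to zero.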

Funding

  • Austrian Science Fund (FWF)
    • Stand-alone Project "Identification and Estimation of Finite Mixture Models" (P17382, 2004-2008)
    • Hertha-Firnberg Project "Modelling Unobserved Heterogeneity with Mixtures" (T351, 2007-2010)
  • Austrian Academy of Sciences (ÖAW)
    • DOC-FFORTE scholarship (2005-2006)

Publications

  • Bettina Grün and Friedrich Leisch. Dealing with label switching in mixture models under genuine multimodality. Journal of Multivariate Analysis, 100(5):851-861, May 2009. [ bib | DOI | preprint ]
  • Bettina Grün and Friedrich Leisch. Identifiability of finite mixtures of multinomial logit models with varying and fixed effects. Journal of Classification, 25(2):225-247, November 2008. [ bib | DOI | preprint ]
  • Bettina Grün and Friedrich Leisch. FlexMix version 2: Finite mixtures with concomitant variables and varying and constant parameters. Journal of Statistical Software, 28(4):1-35, September 2008. [ bib | http ]
  • Bettina Grün. Fitting finite mixtures of linear mixed models with the EM algorithm. In Paula Brito, editor, Compstat 2008 - Proceedings in Computational Statistics, volume II, pages 165-173. Physica Verlag, Heidelberg, Germany, 2008. [ bib ]
  • Bettina Grün and Friedrich Leisch. Finite mixtures of generalized linear regression models. In Shalabh and Christian Heumann, editors, Recent Advances In Linear Models and Related Areas, pages 205-230. Springer, 2008. [ bib | DOI | preprint ]
  • Friedrich Leisch. Visualizing cluster analysis and finite mixture models. In Chun-houh Chen, Wolfgang Härdle, and Antony Unwin, editors, Handbook of Data Visualization, Springer Handbooks of Computational Statistics. Springer Verlag, 2008. [ bib ]
  • Bettina Grün and Friedrich Leisch. Fitting finite mixtures of generalized linear regressions in R. Computational Statistics and Data Analysis, 51(11):5247-5252, 2007. [ bib | .pdf ]
  • Bettina Grün and Friedrich Leisch. FlexMix: An R package for finite mixture modelling. R News, 7(1):8-13, 2007. [ bib | .pdf ]
  • Friedrich Leisch and Bettina Grün. Extending standard cluster algorithms to allow for group constraints. In Alfredo Rizzi and Maurizio Vichi, editors, Compstat 2006 - Proceedings in Computational Statistics, pages 885-892. Physica Verlag, Heidelberg, Germany, 2006. [ bib | .pdf ]
  • Bettina Grün and Friedrich Leisch. Fitting finite mixtures of linear regression models with varying & fixed effects in R. In Alfredo Rizzi and Maurizio Vichi, editors, Compstat 2006 - Proceedings in Computational Statistics, pages 853-860. Physica Verlag, Heidelberg, Germany, 2006. [ bib | .pdf ]
  • Bettina Grün and Friedrich Leisch. Finite mixture model diagnostics using the parametric bootstrap. In Wilfried Elmenreich and Hans Kaiser, editors, Proceedings of the Junior Scientist Conference 2006, pages 301-302, Vienna, Austria, April 2006. Vienna University of Technology. [ bib | .pdf ]
  • Friedrich Leisch. FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software, 11(8), 2004. [ bib | http ]
  • Bettina Grün and Friedrich Leisch. Bootstrapping finite mixture models. In Compstat 2004 - Proceedings in Computational Statistics, pages 1115-1122. Physica Verlag, Heidelberg, Germany, 2004. ISBN 3-7908-1554-3. [ bib | .pdf ]
  • Friedrich Leisch. Exploring the structure of mixture model components. In Jaromir Antoch, editor, Compstat 2004 - Proceedings in Computational Statistics, pages 1405-1412. Physica Verlag, Heidelberg, Germany, 2004. ISBN 3-7908-1554-3. [ bib | .pdf ]