Mixtures of t factor analysers with censored responses and external covariates: An application to educational data from Peru

Br J Math Stat Psychol. 2024 May;77(2):316-336. doi: 10.1111/bmsp.12329. Epub 2023 Dec 14.

Abstract

Analysing data from educational tests allows governments to make decisions for improving the quality of life of individuals in a society. One of the key responsibilities of statisticians is to develop models that provide decision-makers with pertinent information about the latent process that educational tests seek to represent. Mixtures of t factor analysers (MtFA) have emerged as a powerful device for model-based clustering and classification of high-dimensional data containing one or several groups of observations with fatter tails or anomalous outliers. This paper considers an extension of MtFA for robust clustering of censored data, referred to as the MtFAC model, by incorporating external covariates. The enhanced flexibility of including covariates in MtFAC enables cluster-specific multivariate regression analysis of dependent variables with censored responses arising from upper and/or lower detection limits of experimental equipment. An alternating expectation conditional maximization (AECM) algorithm is developed for maximum likelihood estimation of the proposed model. Two simulation experiments are conducted to examine the effectiveness of the techniques presented. Furthermore, the proposed methodology is applied to Peruvian data from the 2007 Early Grade Reading Assessment, and the results obtained from the analysis provide new insights regarding the reading skills of Peruvian students.
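As a minimal illustration of what "censored responses arising from upper and/or lower detection limits" means in practice, the sketch below records a response at the limit whenever the true value falls at or beyond it, together with an indicator of the censoring type. The function name `censor`, the status coding (-1, 0, 1), and the example values are assumptions made for illustration only; they are not taken from the paper, and the sketch does not implement the MtFAC model or the AECM algorithm.

```python
from typing import List, Optional, Tuple


def censor(
    values: List[float],
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> Tuple[List[float], List[int]]:
    """Apply detection limits to a list of responses (illustrative helper).

    Returns the recorded values and a status code per observation:
    -1 = left-censored (at or below `lower`),
     0 = fully observed,
     1 = right-censored (at or above `upper`).
    """
    recorded: List[float] = []
    status: List[int] = []
    for v in values:
        if lower is not None and v <= lower:
            # Only "v is at most the lower limit" is known.
            recorded.append(lower)
            status.append(-1)
        elif upper is not None and v >= upper:
            # Only "v is at least the upper limit" is known.
            recorded.append(upper)
            status.append(1)
        else:
            recorded.append(v)
            status.append(0)
    return recorded, status


# Hypothetical scores with a floor of 0.5 and a ceiling of 3.0.
recorded, status = censor([0.2, 1.5, 3.7], lower=0.5, upper=3.0)
```

In a likelihood-based analysis, observations flagged as censored would typically contribute a tail probability rather than a density value, which is why the censoring indicator needs to be kept alongside the recorded value.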

Keywords: AECM algorithm; censored data; factor analysis; outliers; truncated multivariate t distribution.

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Humans
  • Likelihood Functions
  • Multivariate Analysis
  • Peru
  • Quality of Life*