Sparse Linear Discriminant Analysis using the Prior-Knowledge-Guided Block Covariance Matrix

Chemometr Intell Lab Syst. 2020 Nov 15:206:104142. doi: 10.1016/j.chemolab.2020.104142. Epub 2020 Aug 27.

Abstract

Linear discriminant analysis faces two key challenges in the high-dimensional setting: singularity of the covariance matrix and difficulty in interpreting the resulting classifier. Several methods have been proposed to address these problems, but they focused only on identifying a parsimonious set of variables that maximizes classification accuracy, and most did not appropriately consider the dependency between variables or the efficacy of the selected variables. To address these limitations, we propose a new approach that formulates a quadratic optimization problem to estimate the sparse discriminant vector directly, without needing to estimate the whole inverse covariance matrix. The approach also allows external information to be integrated to guide the structure of the covariance matrix. We evaluated the proposed model in simulation studies and then applied it to a transcriptomic study that aims to identify genomic markers predictive of the response to cancer immunotherapy, where the covariance matrix was constructed from prior knowledge available in a pathway database.
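
The abstract does not state the exact objective, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes an l1-penalized quadratic problem, min over beta of 0.5 * beta' Sigma beta - delta' beta + lam * ||beta||_1, where delta is the difference of class means and Sigma is the pooled within-class covariance with entries between variables in different prior-knowledge groups (for example, pathways) set to zero. The function names, the penalty value, and the coordinate-descent solver are illustrative choices, not the authors' implementation.

```python
import numpy as np

def block_pooled_covariance(X, y, groups):
    """Pooled within-class covariance, zeroed outside the assumed blocks."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    Xc = X.copy()
    for c in classes:
        # Center each class by its own mean.
        Xc[y == c] -= X[y == c].mean(axis=0)
    S = Xc.T @ Xc / (len(y) - len(classes))
    groups = np.asarray(groups)
    # Keep covariance only between variables that share a group label.
    same_block = groups[:, None] == groups[None, :]
    return S * same_block

def soft_threshold(x, t):
    """Scalar soft-thresholding operator."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def sparse_discriminant(Sigma, delta, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for 0.5 b'Sigma b - delta'b + lam * ||b||_1.

    Assumes every diagonal entry of Sigma is strictly positive.
    """
    p = len(delta)
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # Partial residual that excludes variable j's own contribution.
            r = delta[j] - Sigma[j] @ beta + Sigma[j, j] * beta[j]
            beta[j] = soft_threshold(r, lam) / Sigma[j, j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

# Synthetic illustration: 6 variables in two hypothetical groups of 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = np.repeat([0, 1], 20)
X[y == 1, :2] += 1.5  # only the first two variables separate the classes
groups = [0, 0, 0, 1, 1, 1]

Sigma = block_pooled_covariance(X, y, groups)
delta = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
beta = sparse_discriminant(Sigma, delta, lam=0.5)
```

In this sketch no inverse covariance matrix is ever formed: the solver only needs rows of Sigma, and the zero pattern imposed by `groups` is where external information such as pathway membership would enter.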

Keywords: Linear discriminant analysis; block covariance matrix; cancer immunotherapy; data integration; penalized approach.