Multi-task Learning with High-Dimensional Noisy Images

J Am Stat Assoc. 2024;119(545):650-663. doi: 10.1080/01621459.2022.2140052. Epub 2022 Nov 17.

Abstract

Recent medical imaging studies have given rise to distinct but inter-related datasets corresponding to multiple experimental tasks or longitudinal visits. Standard scalar-on-image regression models that fit each dataset separately are not equipped to leverage information across inter-related images, and existing multi-task learning approaches are compromised by the inability to account for the noise that is often observed in images. We propose a novel joint scalar-on-image regression framework involving wavelet-based image representations with grouped penalties that are designed to pool information across inter-related images for joint learning, and which explicitly accounts for noise in high-dimensional images via a projection-based approach. In the presence of non-convexity arising due to noisy images, we derive non-asymptotic error bounds under non-convex as well as convex grouped penalties, even when the number of voxels increases exponentially with sample size. A projected gradient descent algorithm is used for computation, which is shown to approximate the optimal solution via well-defined non-asymptotic optimization error bounds under noisy images. Extensive simulations and application to a motivating longitudinal Alzheimer's disease study illustrate significantly improved predictive ability and greater power to detect true signals, that are simply missed by existing methods without noise correction due to the attenuation to null phenomenon.

Keywords: High dimensional statistics; measurement error in covariates; multi-task learning; neuroimaging analysis; scalar-on-image regression.