Cooperative learning for multiview analysis

Proc Natl Acad Sci U S A. 2022 Sep 20;119(38):e2202113119. doi: 10.1073/pnas.2202113119. Epub 2022 Sep 12.

Abstract

We propose a method for supervised learning with multiple sets of features ("views"). The multiview problem is especially important in biology and medicine, where "-omics" data, such as genomics, proteomics, and radiomics, are measured on a common set of samples. "Cooperative learning" combines the usual squared-error loss of predictions with an "agreement" penalty to encourage the predictions from different data views to agree. By varying the weight of the agreement penalty, we get a continuum of solutions that include the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) in an adaptive manner, using a validation set or cross-validation to estimate test set prediction error. One version of our fitting procedure is modular, allowing one to choose different fitting mechanisms (e.g., lasso, random forests, boosting, or neural networks) appropriate for different data views. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty, yielding feature sparsity. The method can be especially powerful when the different data views share an underlying relationship that can be exploited to strengthen their signals. We show that cooperative learning achieves higher predictive accuracy on simulated data and real multiomics examples of labor-onset prediction. By leveraging aligned signals and allowing flexible fitting mechanisms for different modalities, cooperative learning offers a powerful approach to multiomics data fusion.
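
For intuition, the cooperative regularized linear regression described in the abstract can be written, for two views X and Z, as

    minimize over (beta_x, beta_z):
        (1/2) ||y - X beta_x - Z beta_z||^2
        + (rho/2) ||X beta_x - Z beta_z||^2
        + lambda (||beta_x||_1 + ||beta_z||_1)

where rho is the agreement weight: rho = 0 reduces to early fusion (an ordinary lasso on the concatenated features), while rho = 1 decouples the two views, giving a simple form of late fusion. A direct calculation shows this objective is equivalent to a standard lasso on suitably augmented data. The sketch below illustrates that formulation with scikit-learn; it is a minimal illustration under these assumptions, not the authors' released software, and the function name cooperative_lasso is ours.

```python
import numpy as np
from sklearn.linear_model import Lasso

def cooperative_lasso(X, Z, y, rho=0.5, lam=0.1):
    """Two-view cooperative lasso via an equivalent augmented-data lasso.

    Minimizes
        (1/2)||y - X bx - Z bz||^2 + (rho/2)||X bx - Z bz||^2
        + lam * (||bx||_1 + ||bz||_1).
    """
    n = X.shape[0]
    # Row block 1 carries the usual squared-error loss; row block 2
    # encodes the agreement penalty (its "response" is zero).
    X_aug = np.vstack([
        np.hstack([X, Z]),
        np.hstack([-np.sqrt(rho) * X, np.sqrt(rho) * Z]),
    ])
    y_aug = np.concatenate([y, np.zeros(n)])
    # sklearn's Lasso minimizes (1/(2 m))||y - X b||^2 + alpha ||b||_1
    # over m = 2n augmented rows, so alpha = lam / (2 n) matches the
    # unscaled objective above.
    fit = Lasso(alpha=lam / (2 * n), fit_intercept=False).fit(X_aug, y_aug)
    p = X.shape[1]
    return fit.coef_[:p], fit.coef_[p:]

# Toy usage: a shared latent factor drives both views and the response.
rng = np.random.default_rng(0)
n, p, q = 200, 10, 10
u = rng.normal(size=n)
X = u[:, None] + rng.normal(size=(n, p))
Z = u[:, None] + rng.normal(size=(n, q))
y = 2 * u + rng.normal(size=n)
bx, bz = cooperative_lasso(X, Z, y, rho=0.5, lam=5.0)
```

In practice, rho and lambda would be chosen adaptively, e.g., by estimating test error over a grid of values with a validation set or cross-validation, as the abstract describes.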

Keywords: data fusion; multiomics; sparsity; supervised learning.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genomics* / methods
  • Neural Networks, Computer*
  • Supervised Machine Learning*