A Simple Strategy for Identifying Conserved Features across Non-independent Omics Studies

Eric Reed; Paola Sebastiani

doi:10.1101/2023.11.22.568276

bioRxiv [Preprint]. 2023 Nov 23:2023.11.22.568276. doi: 10.1101/2023.11.22.568276.

Authors

Eric Reed¹, Paola Sebastiani¹,²,³

Affiliations

¹ Data Intensive Study Center, Tufts University.
² Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA.
³ Department of Medicine, School of Medicine, Tufts University, Boston, MA.

Abstract

False discovery is an ever-present concern in omics research, especially for burgeoning technologies whose biomolecular measurements have unvetted specificity, because such unknowns obscure the ability to characterize biologically informative features from studies performed with any single platform. Accordingly, performing replication studies of the same samples using different omics platforms is a viable strategy for identifying high-confidence molecular associations that are conserved across studies. However, an important caveat of replication studies that include the same samples is that they are inherently non-independent, leading to overestimation of conservation if the studies are treated as independent. Strategies for accounting for such inter-study dependencies have been proposed for meta-analysis methods, which are devised to increase statistical power to detect molecular associations present in one or more studies, but these methods are not immediately suited for identifying molecular associations conserved across multiple studies. Here we present a unifying strategy for performing inter-study conservation analysis as an alternative to meta-analysis strategies for aggregating summary statistical results of shared features across complementary studies, while accounting for inter-study dependency. This method, which we refer to as "adjusted maximum p-value" (AdjMaxP), is easy to implement, with both inter-study dependency and conservation estimated directly from the p-values of the molecular feature-level association tests in each study. Through simulation-based assessment, we demonstrate AdjMaxP's improved performance for accurately identifying conserved features over a related meta-analysis strategy for non-independent studies.
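
The overestimation described above can be illustrated with a small simulation. The sketch below is not the AdjMaxP procedure, whose details are not given in this abstract. It only shows what may happen when a maximum p-value from two studies is calibrated as if the studies were independent. The function names, the two-study setup, the correlation parameter `rho`, and the use of the global-null relation P(max p ≤ x) = x² are all assumptions made for illustration.

```python
import math
import random


def two_sided_p(z):
    # Two-sided p-value for a standard normal z-score.
    return math.erfc(abs(z) / math.sqrt(2.0))


def naive_max_p_rate(rho, n_features=20000, alpha=0.05, seed=0):
    """Fraction of null features called 'conserved' by a naive max-p rule.

    Hypothetical setup: two studies whose null z-scores share a
    correlation rho (for example, because they use the same samples).
    The naive rule treats the studies as independent, so it uses
    P(max p <= x) = x**2 under the global null.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_features):
        shared = rng.gauss(0.0, 1.0)  # component common to both studies
        z1 = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        z2 = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        p_max = max(two_sided_p(z1), two_sided_p(z2))
        # Independence-based calibration of the maximum p-value.
        if p_max ** 2 <= alpha:
            hits += 1
    return hits / n_features


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, naive_max_p_rate(rho))
```

With `rho = 0` the rate should sit close to `alpha`, while larger values of `rho` should push it above `alpha`. This is the kind of inflation that a dependency adjustment is meant to correct.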

Publication types

Preprint
