Compound p-value statistics for multiple testing procedures

J Multivar Anal. 2014 Apr 1;126:153-166. doi: 10.1016/j.jmva.2014.01.007.

Abstract

Many multiple testing procedures use the p-values from the individual pairs of hypothesis tests, and are valid if the p-value statistics are independent and uniformly distributed under the null hypotheses. However, it has recently been shown that multiple testing procedures of this type are inefficient, since such p-values do not depend upon all of the available data. This paper provides tools for constructing compound p-value statistics, which depend upon all of the available data but still satisfy the conditions of independence and uniformity under the null hypotheses. Several examples are provided, including a class of compound p-value statistics for testing location shifts. It is demonstrated, both analytically and through simulations, that multiple testing procedures tend to reject more false null hypotheses when applied to these compound p-values rather than to the usual p-values, while still guaranteeing the desired type I error rate control. The compound p-values are used to analyze a real microarray data set, where they lead to more rejected null hypotheses.

Keywords: Empirical Bayes; False discovery rate; Microarray analysis; Multiple decision function; Multiple decision process; Multiple testing; Sample splitting; Test data; Training data.
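
As an illustration of the kind of p-value based multiple testing procedure the abstract refers to, the following is a minimal sketch of a step-up false discovery rate procedure applied to a vector of p-values. The abstract does not name a specific procedure, so the choice of the Benjamini-Hochberg procedure is an assumption made here, as are the function name `benjamini_hochberg` and the example p-values. The sketch does not construct compound p-values; it only shows the downstream step that would receive either the usual or the compound p-values as input.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Illustrative assumption: the Benjamini-Hochberg step-up procedure,
    which relies on null p-values being uniform (and, in its classical
    form, independent), the conditions mentioned in the abstract.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that the k-th smallest p-value
    # is at most k * alpha / m (0 if no rank qualifies).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per hypothesis test (not data from the paper).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(p_values, alpha=0.05)
print(sum(decisions), "null hypotheses rejected")
```

Because the procedure only sees the p-values, replacing the usual p-values with compound ones that remain independent and uniform under the null hypotheses would leave this step unchanged.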