Compound p-value statistics for multiple testing procedures

Joshua D Habiger; Edsel A Peña

doi:10.1016/j.jmva.2014.01.007

J Multivar Anal. 2014 Apr 1:126:153-166. doi: 10.1016/j.jmva.2014.01.007.

Authors

Joshua D Habiger¹, Edsel A Peña²

Affiliations

¹ Department of Statistics, Oklahoma State University, 301 MSCS building, Stillwater, OK, 74078, United States.
² Department of Statistics, University of South Carolina, 216 LeConte College, Columbia, SC, 29208, United States.

Abstract

Many multiple testing procedures make use of the p-values from the individual pairs of hypothesis tests, and are valid if the p-value statistics are independent and uniformly distributed under the null hypotheses. However, it has recently been shown that these types of multiple testing procedures are inefficient since such p-values do not depend upon all of the available data. This paper provides tools for constructing compound p-value statistics, which are those that depend upon all of the available data, but still satisfy the conditions of independence and uniformity under the null hypotheses. Several examples are provided, including a class of compound p-value statistics for testing location shifts. It is demonstrated, both analytically and through simulations, that multiple testing procedures tend to reject more false null hypotheses when applied to these compound p-values rather than the usual p-values, and at the same time still guarantee the desired type I error rate control. The compound p-values are used to analyze a real microarray data set and allow for more rejected null hypotheses.

Keywords: Empirical Bayes; False discovery rate; Microarray analysis; Multiple decision function; Multiple decision process; Multiple testing; Sample splitting; Test data; Training data.
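The abstract refers to multiple testing procedures that take a collection of p-values as input and are valid when the null p-values are independent and uniformly distributed. It does not name a specific procedure, so the sketch below uses the Benjamini-Hochberg step-up procedure purely as one familiar example of that kind. The function name, the alpha level, and the input p-values are illustrative assumptions, and the code does not construct the paper's compound p-values; it only shows the step where a procedure is applied to whichever p-values (usual or compound) are supplied.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (illustrative choice of procedure).

    Returns a list of booleans, True where the null hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p-values for illustration only; these are not data from the paper.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
decisions = benjamini_hochberg(p, alpha=0.05)
print(sum(decisions), "of", len(p), "null hypotheses rejected")
```

Because the procedure only sees the p-values, swapping in a different set of valid p-values changes the number of rejections without changing the procedure itself, which is the comparison the abstract describes.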
