Power and sample size estimation for the Wilcoxon rank sum test with application to comparisons of C statistics from alternative prediction models

B Rosner; R J Glynn

doi:10.1111/j.1541-0420.2008.01062.x

Biometrics. 2009 Mar;65(1):188-97. doi: 10.1111/j.1541-0420.2008.01062.x. Epub 2008 May 28.

Authors

B Rosner¹, R J Glynn

Affiliation

¹ Channing Laboratory, Harvard Medical School, Boston, Massachusetts 02115, USA. rosner@stat.harvard.edu

PMID: 18510654

Abstract

The Wilcoxon-Mann-Whitney (WMW) U test is commonly used in nonparametric two-group comparisons when the normality of the underlying distribution is questionable. There has been some previous work on estimating power based on this procedure (Lehmann, 1998, Nonparametrics). In this article, we present an approach for estimating type II error that is applicable to any continuous distribution, and we extend the approach to handle grouped continuous data, allowing for ties. We apply these results to obtain standard errors of the area under the receiver operating characteristic curve (AUROC) for risk-prediction rules under H₁ and to compare AUROC between competing risk-prediction rules applied to the same data set. These results are based on SAS-callable functions to evaluate the bivariate normal integral and are thus easily implemented with standard software.
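To make the idea of a power calculation for the WMW test concrete, the following is a minimal Python sketch of the textbook normal approximation to the U statistic under the alternative. It is not the authors' SAS implementation, and it rests on assumptions that are not in the abstract: a normal location shift (X ~ N(0, 1), Y ~ N(delta, 1)), no ties, and a two-sided test. The function names (`shift_probs`, `u_moments_h1`, `wmw_power`, `auroc_se_h1`) are hypothetical. The bivariate normal probability P(X < Y, X < Y') is computed here with a one-dimensional Simpson's rule integral rather than a dedicated bivariate normal routine.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def shift_probs(delta, lo=-10.0, hi=10.0, steps=2000):
    """Return (p1, p2) for X ~ N(0,1) and independent Y, Y' ~ N(delta,1).

    p1 = P(X < Y); p2 = P(X < Y and X < Y').
    p2 is the integral of phi(x) * Phi(delta - x)^2, evaluated by Simpson's rule.
    """
    p1 = STD.cdf(delta / sqrt(2.0))
    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * STD.pdf(x) * STD.cdf(delta - x) ** 2
    return p1, total * h / 3.0


def u_moments_h1(delta, m, n):
    """Mean and variance of U = #{(i, j): X_i < Y_j} under the shift alternative."""
    p1, p2 = shift_probs(delta)
    p3 = p2  # P(X < Y and X' < Y); equals p2 for an equal-variance normal shift
    mean = m * n * p1
    var = m * n * (p1 * (1 - p1)
                   + (n - 1) * (p2 - p1 ** 2)
                   + (m - 1) * (p3 - p1 ** 2))
    return mean, var


def wmw_power(delta, m, n, alpha=0.05):
    """Approximate two-sided power of the WMW test (normal approximation to U)."""
    mean1, var1 = u_moments_h1(delta, m, n)
    mean0 = m * n / 2.0                      # null mean of U
    sd0 = sqrt(m * n * (m + n + 1) / 12.0)   # null SD of U (no ties)
    sd1 = sqrt(var1)
    z = STD.inv_cdf(1 - alpha / 2)
    upper = 1 - STD.cdf((mean0 + z * sd0 - mean1) / sd1)
    lower = STD.cdf((mean0 - z * sd0 - mean1) / sd1)
    return upper + lower


def auroc_se_h1(delta, m, n):
    """Standard error of the AUROC estimate U / (m * n) under the alternative."""
    _, var1 = u_moments_h1(delta, m, n)
    return sqrt(var1) / (m * n)


if __name__ == "__main__":
    print(wmw_power(0.5, 50, 50))
    print(auroc_se_h1(0.5, 50, 50))
```

Because U / (mn) is the empirical AUROC when X and Y are the scores of non-cases and cases, the same variance gives a standard error for the AUROC under the alternative, which is the connection the abstract draws. Under this sketch, setting `delta = 0` recovers the nominal type I error, which is a quick sanity check on the variance formula.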

Publication types

Comparative Study
Research Support, N.I.H., Extramural

MeSH terms

Area Under Curve
Biometry / methods*
Data Interpretation, Statistical
Humans
Models, Theoretical
ROC Curve
Sample Size
Statistics, Nonparametric*

Grants and funding

EY12269/EY/NEI NIH HHS/United States